Abstract
The growing needs of emerging applications have posed significant challenges for the design of optimized manycore systems. Network-on-Chip (NoC) enables the integration of a large number of processing elements (PEs) on a single die. To design optimized manycore systems, we need to establish suitable trade-offs among multiple objectives, including power, performance, and thermal behavior. Therefore, we consider multi-objective design space exploration (MO-DSE) problems arising in the design of NoC-enabled manycore systems: placement of PEs and communication links to optimize two or more objectives (e.g., latency, energy, and throughput). Existing algorithms for solving MO-DSE problems suffer from scalability and accuracy challenges as the size of the design space and the number of objectives grow. In this paper, we propose a novel framework, referred to as Multi-Objective Optimistic Search (MOOS), that performs adaptive design space exploration using a data-driven model to improve the speed and accuracy of the multi-objective design optimization process. We apply MOOS to design both 3D heterogeneous and homogeneous manycore systems using the Rodinia, PARSEC, and SPLASH-2 benchmark suites. We demonstrate that MOOS finds solutions up to 13X faster than state-of-the-art methods while uncovering designs that are up to 20% better in terms of NoC. The optimized 3D manycore systems improve the energy-delay product (EDP) by up to 38% when compared to 3D mesh-based designs optimized for the placement of PEs.
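To make the notion of a trade-off among objectives concrete, the sketch below filters a set of candidate designs down to its Pareto front, i.e., the designs that no other candidate beats on every objective at once. It illustrates the general MO-DSE setting only and is not the MOOS algorithm. The function names `dominates` and `pareto_front` and the (latency, energy) values are hypothetical, and every objective is assumed to be minimized (an objective such as throughput would be negated first).

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if design `a` Pareto-dominates design `b`.

    Assumes every objective is minimized: `a` must be no worse than `b`
    on all objectives and strictly better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(designs: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical (latency, energy) pairs for five candidate designs.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (11.0, 6.0), (8.0, 7.5)]
front = pareto_front(candidates)
print(front)
```

This brute-force filter compares every pair of designs, so its cost grows quadratically with the number of candidates. That is acceptable for a small illustration but hints at why scalability becomes a concern as the design space and the number of objectives grow.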
Index Terms
MOOS: A Multi-Objective Design Space Exploration and Optimization Framework for NoC Enabled Manycore Systems