Dynamic Power and Thermal Management of NoC-Based Heterogeneous MPSoCs

Published: 01 February 2014
Abstract

Advances in silicon process technology have made it possible to include multiple processor cores on a single die. Billion-transistor architectures, usually in the form of networks-on-chip, present a wide range of challenges at the design, microarchitecture, and algorithmic levels, with significant impact on system performance and power consumption. In this article, we propose efficient methods and mechanisms that exploit a heterogeneous network-on-chip (NoC) to achieve a power- and thermal-aware coherent system. To this end, we utilize different management techniques that employ dynamic frequency scaling circuitry and per-node power and temperature sensors to achieve real-time workload prediction and allocation at the node and system levels by means of low-cost threads. The developed heterogeneous multicoprocessing infrastructure is used to evaluate diverse policies for power-aware computing in terms of effectiveness and in relation to distributed sensor-conscious management. The proposed reconfigurable architecture supports coprocessor accelerators per node, monitors the program's power profile on-the-fly, and balances power and thermal behavior at the NoC level. Overall, these techniques form a system exploration methodology that uses a multi-FPGA emulation platform and shows minimal complexity overhead.
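As a rough illustration of the kind of sensor-driven policy the abstract describes, the sketch below steps a node's clock down when its temperature sensor exceeds a limit and sends the next task to the coolest node. It is not the authors' method. The frequency levels, the temperature limit, the hysteresis margin, and the names `Node`, `step_frequency`, and `pick_node` are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical per-node clock settings in MHz (not taken from the article).
FREQ_LEVELS_MHZ = (50, 100, 200)


@dataclass
class Node:
    node_id: int
    temp_c: float    # latest temperature sensor reading, degrees Celsius
    power_mw: float  # latest power sensor reading, milliwatts
    level: int = len(FREQ_LEVELS_MHZ) - 1  # index into FREQ_LEVELS_MHZ


def step_frequency(node, temp_limit_c=80.0, hysteresis_c=5.0):
    """Move one frequency level down when the node is too hot, or one
    level up when it is comfortably below the limit. Returns the new MHz."""
    top = len(FREQ_LEVELS_MHZ) - 1
    if node.temp_c > temp_limit_c and node.level > 0:
        node.level -= 1
    elif node.temp_c < temp_limit_c - hysteresis_c and node.level < top:
        node.level += 1
    return FREQ_LEVELS_MHZ[node.level]


def pick_node(nodes):
    """Choose the coolest node for the next task; ties go to lower power."""
    return min(nodes, key=lambda n: (n.temp_c, n.power_mw))


nodes = [Node(0, 85.0, 300.0), Node(1, 60.0, 250.0), Node(2, 60.0, 200.0)]
for n in nodes:
    print(n.node_id, step_frequency(n), "MHz")
print("next task goes to node", pick_node(nodes).node_id)
```

The hysteresis margin is there only to keep a node from oscillating between two levels when its temperature hovers near the limit. A real policy would also use the workload prediction mentioned in the abstract, which this sketch leaves out.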

Reviews

Amitabha Roy

With increasing transistor counts, multiprocessor systems-on-chip (MPSoCs) provide system and application builders flexibility to deliver solutions that provide good performance while keeping overall system power and temperature under control. This paper deals with the problem of power and temperature management in complex MPSoCs. The authors focus in particular on the globally asynchronous locally synchronous approach. They deal with an MPSoC design that divides the processor into independent islands, each of which manages its voltage and clock frequency independently. Each island consists of a management core and a number of accelerator cores that are interconnected using message-passing links and synchronized through shared memory on the island. Islands are interconnected with a network-on-chip (NoC) that crosses clock and voltage domains. The key contribution of the paper is in the development of a field-programmable gate array (FPGA)-based platform to emulate such a system, in conjunction with thermal and power sensors that provide continuous accurate measurements of power and temperature. Some of the evaluation deals with showing that the FPGA-based emulation matches the simulation of such a system in terms of power and temperature for the individual islands while providing orders of magnitude better simulation throughput. This immediately opens the door to experimenting with and developing global solutions that manage the scheduling of applications while keeping chip temperature and power within budget. The paper makes for an interesting read for anyone interested in experimenting with and developing power and temperature management solutions for MPSoCs, an active research area.

Online Computing Reviews Service
