ABSTRACT
The emergence of power as a first-class design constraint has fueled the proposal of a growing number of run-time power optimizations. Many of these optimizations trade off power-saving opportunity for a variable performance loss that depends on application characteristics and program phase. Furthermore, the potential benefits of these optimizations are sometimes non-additive, and it can be difficult to identify which combinations of optimizations to apply. Trial-and-error approaches have been proposed to adaptively tune a processor. In a chip multiprocessor, however, the cost of individually configuring each core across a wide range of optimizations may be prohibitive for simple trial-and-error approaches.
In this work, we introduce an adaptive, multi-optimization power-saving strategy for multi-core power management. Specifically, we solve the problem of meeting a global, chip-wide power budget through run-time adaptation of highly configurable processor cores. Our approach applies analytic modeling to reduce exploration time and decrease the reliance on trial-and-error methods. We also introduce risk evaluation to balance the benefit of each power-saving optimization against its potential performance loss. Overall, we find that our approach can significantly reduce processor power consumption compared to alternative optimization strategies.
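To make the selection problem concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that some analytic model has already predicted, for every candidate configuration of every core, its power, its expected performance loss, and an uncertainty on that loss. The names `Config`, `risk_adjusted_loss`, `choose_configs`, and `risk_weight`, as well as all numeric values, are hypothetical and chosen only for illustration. The sketch picks one configuration per core so that total predicted power stays within the chip-wide budget while the summed risk-adjusted loss is minimized.

```python
from itertools import product
from typing import NamedTuple, Optional, Sequence, Tuple


class Config(NamedTuple):
    """One candidate core configuration (all fields are assumed model outputs)."""
    name: str
    power_w: float      # predicted core power in watts
    perf_loss: float    # predicted fractional slowdown (0.05 = 5%)
    loss_spread: float  # uncertainty of the slowdown prediction


def risk_adjusted_loss(cfg: Config, risk_weight: float) -> float:
    # Hypothetical risk measure: penalize configurations whose predicted
    # loss is uncertain, so a risky optimization must save more to be chosen.
    return cfg.perf_loss + risk_weight * cfg.loss_spread


def choose_configs(
    cores: Sequence[Sequence[Config]],
    budget_w: float,
    risk_weight: float = 1.0,
) -> Optional[Tuple[Config, ...]]:
    """Return one Config per core meeting the budget, or None if infeasible.

    Exhaustive search over all combinations: exponential in the number of
    cores, so it is only suitable for small illustrative inputs.
    """
    best: Optional[Tuple[Config, ...]] = None
    best_cost = float("inf")
    for combo in product(*cores):
        if sum(c.power_w for c in combo) > budget_w:
            continue  # violates the chip-wide power budget
        cost = sum(risk_adjusted_loss(c, risk_weight) for c in combo)
        if cost < best_cost:
            best, best_cost = combo, cost
    return best


# Two identical cores, each with three made-up candidate configurations.
options = [
    Config("full", 10.0, 0.00, 0.00),
    Config("small_cache", 8.0, 0.02, 0.01),
    Config("narrow_issue", 6.0, 0.05, 0.06),
]
cores = [options, options]
chosen = choose_configs(cores, budget_w=16.0)
print([c.name for c in chosen] if chosen else "no feasible assignment")
```

Because the predictions come from a model rather than from trying each configuration on the hardware, the search itself costs no execution time on the cores. A practical system would likely replace the exhaustive loop with a more scalable selection method, since the number of combinations grows exponentially with the core count.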