Abstract
Power and thermal dissipation constrain multicore performance scaling. Modern processors are built such that they could sustain damaging levels of power dissipation, creating a need for systems that can implement processor power caps. A particular challenge is developing systems that can maximize performance within a power cap, and approaches have been proposed in both software and hardware. Software approaches are flexible, allowing multiple hardware resources to be coordinated for maximum performance, but software is slow, requiring a long time to converge to the power target. In contrast, hardware power capping quickly converges to the power cap, but only manages voltage and frequency, limiting its potential performance. In this work we propose PUPiL, a hybrid software/hardware power capping system. Unlike previous approaches, PUPiL combines hardware's fast reaction time with software's flexibility. We implement PUPiL on a real Linux/x86 platform and compare it to Intel's commercial hardware power capping system for both single- and multi-application workloads. We find PUPiL provides the same reaction time as Intel's hardware with significantly higher performance. On average, PUPiL outperforms hardware by 1.18-2.4× depending on workload and power target. Thus, PUPiL provides a promising way to enforce power caps with greater performance than current state-of-the-art hardware-only approaches.
Maximizing Performance Under a Power Cap: A Comparison of Hardware, Software, and Hybrid Techniques
ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems