Abstract
Computational sprinting is a class of mechanisms that boost performance but dissipate additional power. We describe a sprinting architecture in which many independent chip multiprocessors share a power supply, and sprints are constrained by the chips' thermal limits and the rack's power limits. Moreover, we present the computational sprinting game, a multi-agent perspective on managing sprints. Strategic agents decide whether to sprint based on application phases and system conditions. The game produces an equilibrium that improves task throughput for data analytics workloads by 4-6× over prior greedy heuristics and performs within 90% of an upper bound on throughput from a globally optimized policy.
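The decision each agent faces can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the strategy, so the threshold rule, the epoch-based cooldown, the rack-wide recovery lockout, and every name below (`Agent`, `step`, `rack_limit`, `cooldown_epochs`, `recovery_epochs`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical: smallest estimated gain for which this agent sprints.
    threshold: float
    # Epochs left before the chip has cooled enough to sprint again.
    cooldown_left: int = 0


def step(agents, gains, rack_limit, cooldown_epochs, recovery_epochs):
    """Advance one epoch and return the indices of agents that sprinted.

    gains[i] is agent i's estimated benefit from sprinting in its current
    application phase (assumed to be known to the agent).
    """
    sprinters = []
    for i, (agent, gain) in enumerate(zip(agents, gains)):
        if agent.cooldown_left > 0:
            # Thermal limit: a cooling chip cannot sprint this epoch.
            agent.cooldown_left -= 1
        elif gain >= agent.threshold:
            sprinters.append(i)
            agent.cooldown_left = cooldown_epochs
    if len(sprinters) > rack_limit:
        # Assumed rack power limit: too many simultaneous sprints lock
        # every agent out for a recovery period.
        for agent in agents:
            agent.cooldown_left = max(agent.cooldown_left, recovery_epochs)
    return sprinters


if __name__ == "__main__":
    rack = [Agent(threshold=0.5) for _ in range(3)]
    print(step(rack, [0.9, 0.1, 0.7], rack_limit=2,
               cooldown_epochs=2, recovery_epochs=3))
```

In this toy model, a lower threshold lets an agent sprint more often but makes a rack-wide lockout more likely. That tension between individual gain and shared limits is what the abstract frames as a game.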
Computational Sprinting: Architecture, Dynamics, and Strategies