Computational Sprinting: Architecture, Dynamics, and Strategies

Published: 09 January 2017

Abstract

Computational sprinting is a class of mechanisms that boost performance but dissipate additional power. We describe a sprinting architecture in which many independent chip multiprocessors share a power supply, and sprints are constrained by the chips’ thermal limits and the rack’s power limits. We also present the computational sprinting game, a multi-agent perspective on managing sprints, in which strategic agents decide whether to sprint based on application phases and system conditions. The game produces an equilibrium that improves task throughput for data analytics workloads by 4-6× over prior greedy heuristics and performs within 90% of an upper bound on throughput from a globally optimized policy.
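
The decision structure summarized in the abstract can be illustrated with a toy sketch. It is not the paper's model or equilibrium computation; it only shows the two constraints (a chip-level thermal limit and a rack-level power limit) acting on per-agent sprint decisions. The names `Agent`, `wants_sprint`, `step`, the fixed `threshold` strategy, the one-epoch cooling period, and the `rack_limit` count are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical strategy: sprint only when the current phase's
    # estimated speedup is at least this threshold.
    threshold: float
    # True while the chip is cooling after a sprint (assumed thermal limit).
    cooling: bool = False


def wants_sprint(agent, speedup, rack_recovering):
    """Decide from the application phase (speedup) and system conditions."""
    if agent.cooling or rack_recovering:
        return False
    return speedup >= agent.threshold


def step(agents, speedups, rack_limit, rack_recovering=False):
    """Run one epoch.

    Returns the per-agent sprint decisions and whether the number of
    simultaneous sprinters exceeded the assumed rack power limit.
    """
    decisions = [
        wants_sprint(agent, speedup, rack_recovering)
        for agent, speedup in zip(agents, speedups)
    ]
    for agent, sprinted in zip(agents, decisions):
        # Assumption: a chip that sprinted cools for exactly one epoch,
        # and a chip that was cooling is ready again afterwards.
        agent.cooling = sprinted
    tripped = sum(decisions) > rack_limit
    return decisions, tripped
```

For example, with three agents sharing a threshold of 2.0, speedups of 3.0, 1.5, and 2.0, and a rack limit of one simultaneous sprinter, the first and third agents sprint and the rack limit is exceeded. In the following epoch both chips are cooling, so no agent sprints. A greedy policy corresponds to a very low threshold, which is what makes such trips frequent in this toy setting.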
