Abstract
Computational sprinting is a class of mechanisms that boost performance but dissipate additional power. We describe a sprinting architecture in which many independent chip multiprocessors share a power supply and sprints are constrained by the chips' thermal limits and the rack's power limits. Moreover, we present the computational sprinting game, a multi-agent perspective on managing sprints. Strategic agents decide whether to sprint based on application phases and system conditions. The game produces an equilibrium that improves task throughput for data analytics workloads by 4-6× over prior greedy heuristics and performs within 90% of an upper bound on throughput from a globally optimized policy.
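The interplay between per-chip thermal limits and the shared rack power limit can be illustrated with a toy simulation. This is a minimal sketch under assumptions not stated in the abstract: the threshold rule, the uniform utility distribution, and the names `simulate`, `threshold`, `cooldown`, `rack_limit`, and `recovery` are all hypothetical and are not the authors' model or equilibrium computation.

```python
import random

def simulate(num_agents=100, epochs=1000, threshold=0.5,
             cooldown=2, rack_limit=25, recovery=5, seed=0):
    """Toy model (all parameters are illustrative assumptions).

    Each epoch, an agent whose chip is cool draws a utility in [0, 1)
    for its current application phase and sprints if it exceeds
    `threshold`. A sprinting chip must then cool for `cooldown` epochs.
    If more than `rack_limit` agents sprint in one epoch, the rack
    trips and nobody can sprint for `recovery` epochs.
    """
    rng = random.Random(seed)
    cooling = [0] * num_agents   # epochs until each chip may sprint again
    rack_down = 0                # epochs until the rack recovers
    total_utility = 0.0
    trips = 0
    for _ in range(epochs):
        if rack_down > 0:
            # Rack is recovering: no sprints, but chips keep cooling.
            rack_down -= 1
            cooling = [max(0, c - 1) for c in cooling]
            continue
        sprinters = 0
        for i in range(num_agents):
            if cooling[i] > 0:
                cooling[i] -= 1
                continue
            utility = rng.random()
            if utility > threshold:
                sprinters += 1
                total_utility += utility
                cooling[i] = cooldown
        if sprinters > rack_limit:
            # Too many simultaneous sprints exceed the rack's power limit.
            trips += 1
            rack_down = recovery
    return total_utility, trips

if __name__ == "__main__":
    for t in (0.2, 0.5, 0.8):
        utility, trips = simulate(threshold=t)
        print(f"threshold={t}: utility={utility:.1f}, rack trips={trips}")
```

Sweeping `threshold` in this sketch shows the tension the abstract describes: agents that sprint too readily risk tripping the shared supply, while agents that sprint too rarely forgo performance.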
The Computational Sprinting Game
ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems