Abstract
In 2013, U.S. data centers accounted for 2.2% of the country’s total electricity consumption, a figure that is projected to increase rapidly over the next decade. Many important data center workloads in cloud computing are interactive and demand strict quality-of-service (QoS) levels to meet user expectations, which makes it challenging to reduce power consumption while keeping up with growing performance demands.
This article introduces Hipster, a technique that combines heuristics and reinforcement learning to improve resource efficiency in cloud systems. Hipster uses heterogeneous multi-cores and dynamic voltage and frequency scaling (DVFS) to reduce energy consumption while managing the QoS of latency-critical workloads. To improve data center utilization and make the best use of the available resources, Hipster can dynamically assign the remaining cores to batch workloads without violating the QoS constraints of the latency-critical workloads. We perform experiments on a 64-bit ARM big.LITTLE platform and show that, compared to prior work, Hipster improves the QoS guarantee for Web-Search from 80% to 96% and for Memcached from 92% to 99%, while reducing energy consumption by up to 18%. Hipster is also effective at learning and adapting automatically to the specific requirements of new incoming workloads, doing just enough to meet the QoS while optimizing resource consumption.
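To make the idea of combining a heuristic with reinforcement learning more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy environment with five hypothetical core/frequency configurations, five discrete load levels, and invented capacity and power numbers. A simple ladder heuristic drives an initial learning phase, after which a value table picks the configuration for each load level. Because the chosen configuration does not influence the next load level in this toy setting, the table update is a one-step simplification of Q-learning (no discounted future term). All names (`CONFIGS`, `DEMAND`, `TRACE`, `run_interval`, `heuristic`, `train`, `policy`) are illustrative assumptions.

```python
import random

# All numbers below are invented for illustration only.
# (label, relative capacity, relative power), ordered from weakest to strongest.
CONFIGS = [
    ("2 small cores", 1.0, 1.0),
    ("4 small cores", 2.0, 1.8),
    ("1 big core, low frequency", 2.5, 3.0),
    ("2 big cores, low frequency", 4.5, 5.5),
    ("2 big cores, high frequency", 7.0, 9.0),
]
DEMAND = [0.8, 1.8, 2.4, 4.0, 6.5]   # capacity needed at each load level
TRACE = [0, 1, 2, 3, 4, 3, 2, 1]     # repeating, diurnal-like load pattern
MAX_POWER = max(power for _, _, power in CONFIGS)


def run_interval(level, action):
    """Toy environment: report whether QoS was met and the power drawn."""
    _, capacity, power = CONFIGS[action]
    return capacity >= DEMAND[level], power


def reward(qos_met, power):
    """Penalize QoS violations; otherwise prefer lower power."""
    return 1.0 - 0.5 * power / MAX_POWER if qos_met else -1.0


def heuristic(action, qos_met):
    """Ladder rule: step up after a violation, otherwise try one step down."""
    if not qos_met:
        return min(action + 1, len(CONFIGS) - 1)
    return max(action - 1, 0)


def train(steps=4000, learning_steps=400, alpha=0.2, epsilon=0.05, seed=0):
    """Fill a value table q[load level][config] from observed rewards."""
    rng = random.Random(seed)
    q = [[0.0] * len(CONFIGS) for _ in DEMAND]
    action, qos_met = len(CONFIGS) - 1, True   # start from the safest config
    for step in range(steps):
        level = TRACE[step % len(TRACE)]
        if step < learning_steps:
            action = heuristic(action, qos_met)        # heuristic learning phase
        elif rng.random() < epsilon:
            action = rng.randrange(len(CONFIGS))       # occasional exploration
        else:
            action = max(range(len(CONFIGS)), key=lambda a: q[level][a])
        qos_met, power = run_interval(level, action)
        # Move the table entry toward the observed reward.
        q[level][action] += alpha * (reward(qos_met, power) - q[level][action])
    return q


def policy(q):
    """Greedy configuration index for each load level."""
    return [max(range(len(CONFIGS)), key=lambda a: q[lvl][a])
            for lvl in range(len(DEMAND))]


if __name__ == "__main__":
    learned = policy(train())
    for level, action in enumerate(learned):
        print(f"load level {level}: {CONFIGS[action][0]}")
```

In this sketch the reward is negative whenever QoS is violated and otherwise grows as power falls, so the table is steered toward the cheapest configuration that still meets QoS at each load level. A real controller would replace `run_interval` with measured tail latency and power readings.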
The Hipster Approach for Improving Cloud System Efficiency