Abstract
Due to the mainstream adoption of cloud computing and its rapidly increasing energy usage, the efficient management of cloud computing resources has become an important issue. A key challenge in managing these resources lies in the volatility of their demand. Although a wide variety of online algorithms (e.g., Receding Horizon Control, Online Balanced Descent) have been designed, it is hard for cloud operators to pick the right one. In particular, these algorithms vary greatly in their use of predictions and in their performance guarantees. This paper studies a scheme for automatic algorithm selection in real time. To do this, we empirically study the prediction errors from real-world cloud computing traces. The results show that prediction errors differ across prediction algorithms, across virtual machines, and over the time horizon. Based on these observations, we propose a simple prediction error model and prove upper bounds on the dynamic regret of several online algorithms. We then apply the empirical and theoretical results to create a simple online meta-algorithm that chooses the best algorithm on the fly. Numerical simulations demonstrate that the performance of the designed policy is close to that of the best algorithm in hindsight.
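The abstract does not say how the meta-algorithm makes its choice, so the following is only an illustrative sketch of the general idea of picking among candidate online algorithms on the fly. It assumes a simple follow-the-leader rule: every candidate is run in parallel, its hypothetical ("shadow") cost is accumulated, and at each step the candidate with the lowest cost so far is selected. The two candidate policies, the cost function, and all names (`last_value`, `moving_average`, `step_cost`, `select_online`) are hypothetical stand-ins, not the algorithms or the cost model of the paper.

```python
from typing import Callable, Dict, List, Tuple

# A policy maps the demand history observed so far to a provisioning level.
Policy = Callable[[List[float]], float]


def last_value(history: List[float]) -> float:
    """Toy policy: provision for the most recently observed demand."""
    return history[-1] if history else 0.0


def moving_average(history: List[float]) -> float:
    """Toy policy: provision for the mean of the last three observed demands."""
    window = history[-3:]
    return sum(window) / len(window) if window else 0.0


def step_cost(x: float, prev_x: float, demand: float,
              switch_weight: float = 0.5) -> float:
    """Assumed per-step cost: mismatch with demand plus a switching penalty."""
    return abs(x - demand) + switch_weight * abs(x - prev_x)


def select_online(demands: List[float],
                  policies: Dict[str, Policy]) -> Tuple[List[str], Dict[str, float]]:
    """Follow-the-leader selection over shadow costs (an assumption, not the paper's rule).

    Returns the name of the policy selected at each step and the final
    cumulative shadow cost of every candidate.
    """
    names = list(policies)
    shadow_cost = {n: 0.0 for n in names}   # cost each policy would have paid
    shadow_prev = {n: 0.0 for n in names}   # each policy's previous allocation
    history: List[float] = []
    chosen: List[str] = []

    for demand in demands:
        # Select the policy with the lowest cumulative cost so far;
        # ties go to the policy listed first.
        leader = min(names, key=lambda n: shadow_cost[n])
        chosen.append(leader)

        # Demand is revealed only after the decision, so update every
        # candidate's shadow cost with the realized demand.
        for n in names:
            x = policies[n](history)
            shadow_cost[n] += step_cost(x, shadow_prev[n], demand)
            shadow_prev[n] = x
        history.append(demand)

    return chosen, shadow_cost


if __name__ == "__main__":
    demands = [10.0, 0.0, 10.0, 0.0, 10.0, 0.0]
    chosen, costs = select_online(
        demands, {"last_value": last_value, "moving_average": moving_average}
    )
    print(chosen)
    print(costs)
```

On an oscillating demand trace such as the one in the usage example, the last-value policy keeps chasing the previous demand, so the selector shifts to the moving-average policy once enough cost has accumulated. A practical scheme would also have to account for the cost of switching between algorithms, which this sketch ignores.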
Online Optimization in Cloud Resource Provisioning: Predictions, Regrets, and Algorithms