
Online Optimization in Cloud Resource Provisioning: Predictions, Regrets, and Algorithms

Published: 26 March 2019

Abstract

Due to the mainstream adoption of cloud computing and its rapidly increasing energy usage, the efficient management of cloud computing resources has become an important issue. A key challenge in managing these resources lies in the volatility of their demand. Although a wide variety of online algorithms (e.g., Receding Horizon Control, Online Balanced Descent) have been designed, it is hard for cloud operators to pick the right one. In particular, these algorithms vary greatly in their use of predictions and in their performance guarantees. This paper studies a scheme for automatic algorithm selection in real time. To do this, we empirically study the prediction errors from real-world cloud computing traces. The results show that prediction errors differ across prediction algorithms, across virtual machines, and over the time horizon. Based on these observations, we propose a simple prediction error model and prove upper bounds on the dynamic regret of several online algorithms. We then apply the empirical and theoretical results to create a simple online meta-algorithm that chooses the best algorithm on the fly. Numerical simulations demonstrate that the performance of the designed policy is close to that of the best algorithm in hindsight.
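
To make the idea of choosing among online algorithms on the fly concrete, the following is a minimal sketch of one generic selection rule, follow-the-leader, which at each step picks the candidate with the lowest cumulative cost observed so far. It is an illustration only and is not taken from the paper: the abstract does not specify how the meta-algorithm works, the function names are hypothetical, the labels "rhc" and "obd" are used purely as names, the per-step costs are made up, and switching costs are ignored for simplicity.

```python
from typing import Dict, List, Sequence


def follow_the_leader(costs: Dict[str, Sequence[float]]) -> List[str]:
    """Pick, at each step, the candidate with the lowest cumulative past cost.

    `costs[name][t]` is the (hypothetical) cost candidate `name` would incur
    at step t. Only costs from steps before t are used to choose at step t.
    """
    names = list(costs)
    horizon = len(costs[names[0]])
    cumulative = {name: 0.0 for name in names}
    choices: List[str] = []
    for t in range(horizon):
        # Ties are broken in favor of the candidate listed first.
        leader = min(names, key=lambda n: cumulative[n])
        choices.append(leader)
        # After acting, the cost of every candidate at step t is revealed.
        for name in names:
            cumulative[name] += costs[name][t]
    return choices


def best_in_hindsight(costs: Dict[str, Sequence[float]]) -> str:
    """Return the single candidate with the lowest total cost over the horizon."""
    return min(costs, key=lambda n: sum(costs[n]))


def total_cost(costs: Dict[str, Sequence[float]], choices: Sequence[str]) -> float:
    """Total cost incurred by following `choices` step by step."""
    return sum(costs[name][t] for t, name in enumerate(choices))


if __name__ == "__main__":
    # Made-up per-step costs for two candidates; not data from the paper.
    example = {"rhc": [1, 1, 5, 5], "obd": [2, 2, 2, 2]}
    picks = follow_the_leader(example)
    print(picks, total_cost(example, picks), best_in_hindsight(example))
```

In this toy run the selector switches candidates once the early leader starts to incur higher costs, and its total cost ends up close to that of the best single candidate in hindsight. A practical scheme would also need to account for the cost of switching between the decisions of different algorithms.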

