A Cost-Efficient Container Orchestration Strategy in Kubernetes-Based Cloud Computing Infrastructures with Heterogeneous Resources

Published: 17 April 2020

Abstract

Containers, as a lightweight application virtualization technology, have recently gained immense popularity in mainstream cluster management systems like Google Borg and Kubernetes. Prevalently adopted by these systems for task deployments of diverse workloads such as big data, web services, and IoT, they support agile application deployment, environmental consistency, OS distribution portability, application-centric management, and resource isolation. Although most of these systems are mature with advanced features, their optimization strategies are still tailored to the assumption of a static cluster. Elastic compute resources would enable heterogeneous resource management strategies in response to the dynamic business volume for various types of workloads. Hence, we propose a heterogeneous task allocation strategy for cost-efficient container orchestration through resource utilization optimization and elastic instance pricing with three main features. The first one is to support heterogeneous job configurations to optimize the initial placement of containers into existing resources by task packing. The second one is cluster size adjustment to meet the changing workload through autoscaling algorithms. The third one is a rescheduling mechanism to shut down underutilized VM instances for cost saving and reallocate the relevant jobs without losing task progress. We evaluate our approach in terms of cost and performance on the Australian National Cloud Infrastructure (Nectar). Our experiments demonstrate that the proposed strategy could reduce the overall cost by 23% to 32% for different types of cloud workload patterns when compared to the default Kubernetes framework.
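The three features named in the abstract (task packing, autoscaling, and rescheduling) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a plain first-fit-decreasing packing heuristic, a scale-out rule that picks the cheapest VM type able to host a task, and a CPU-utilization threshold for draining VMs. All names (`VMType`, `VM`, `place`, `consolidate`), the prices, and the 0.3 threshold are hypothetical, and the checkpointing needed to move a job "without losing task progress" is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class VMType:
    name: str
    cpu: float    # vCPUs offered by this instance type
    mem: float    # memory in GiB
    price: float  # hypothetical cost per hour


@dataclass
class VM:
    vtype: VMType
    tasks: list = field(default_factory=list)  # (name, cpu, mem) tuples

    def free(self):
        """Return the (cpu, mem) capacity not yet claimed by tasks."""
        used_cpu = sum(t[1] for t in self.tasks)
        used_mem = sum(t[2] for t in self.tasks)
        return self.vtype.cpu - used_cpu, self.vtype.mem - used_mem

    def fits(self, cpu, mem):
        free_cpu, free_mem = self.free()
        return cpu <= free_cpu and mem <= free_mem


def place(tasks, cluster, catalog):
    """Pack tasks first-fit-decreasing; scale out with the cheapest fitting type."""
    for task in sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True):
        _, cpu, mem = task
        target = next((vm for vm in cluster if vm.fits(cpu, mem)), None)
        if target is None:
            # Scale out: no existing VM has room, so start a new one.
            candidates = [v for v in catalog if v.cpu >= cpu and v.mem >= mem]
            if not candidates:
                raise ValueError(f"no VM type can host task {task[0]}")
            target = VM(min(candidates, key=lambda v: v.price))
            cluster.append(target)
        target.tasks.append(task)
    return cluster


def consolidate(cluster, threshold=0.3):
    """Drain VMs below the CPU threshold when all their tasks fit elsewhere."""
    kept = list(cluster)
    for vm in list(cluster):
        used_cpu = vm.vtype.cpu - vm.free()[0]
        if used_cpu / vm.vtype.cpu >= threshold:
            continue
        others = [o for o in kept if o is not vm]
        snapshot = [list(o.tasks) for o in others]  # for rollback
        moved_all = True
        for task in vm.tasks:
            host = next((o for o in others if o.fits(task[1], task[2])), None)
            if host is None:
                moved_all = False
                break
            host.tasks.append(task)
        if moved_all:
            kept.remove(vm)  # VM is now empty and can be shut down
        else:
            for o, tasks in zip(others, snapshot):
                o.tasks = tasks  # undo the partial move
    return kept


catalog = [VMType("small", 2, 4, 0.10), VMType("large", 8, 16, 0.35)]
jobs = [("a", 1, 2), ("b", 1, 1), ("c", 4, 8), ("d", 0.5, 1)]
cluster = place(jobs, [], catalog)
print([(vm.vtype.name, len(vm.tasks)) for vm in cluster])
```

In this sketch the largest task forces a `large` instance, and the remaining tasks are packed into its spare capacity instead of triggering further scale-out. A real orchestrator would also have to weigh billing periods, migration overhead, and placement constraints, none of which appear here.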

Published in

ACM Transactions on Internet Technology, Volume 20, Issue 2
Special Section on Emotions in Conflictual Social Interactions and Regular Papers
May 2020, 256 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/3386441
Editor: Ling Liu

Copyright © 2020 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 17 April 2020
• Accepted: 1 January 2020
• Revised: 1 November 2019
• Received: 1 August 2019
