Abstract
Containers, a lightweight application virtualization technology, have recently become widely used in mainstream cluster management systems such as Google Borg and Kubernetes. These systems use containers to deploy tasks for diverse workloads such as big data, web services, and IoT, because containers support agile application deployment, environmental consistency, OS distribution portability, application-centric management, and resource isolation. Although most of these systems are mature and offer advanced features, their optimization strategies still assume a static cluster. Elastic compute resources would enable heterogeneous resource management strategies that respond to the changing business volume of different workload types. We therefore propose a heterogeneous task allocation strategy for cost-efficient container orchestration, based on resource utilization optimization and elastic instance pricing, with three main features. First, it supports heterogeneous job configurations and uses task packing to optimize the initial placement of containers onto existing resources. Second, it adjusts the cluster size to meet the changing workload through autoscaling algorithms. Third, it provides a rescheduling mechanism that shuts down underutilized VM instances to save cost and reallocates the affected jobs without losing task progress. We evaluate our approach in terms of cost and performance on the Australian National Cloud Infrastructure (Nectar). Our experiments show that, compared with the default Kubernetes framework, the proposed strategy can reduce the overall cost by 23% to 32% across different types of cloud workload patterns.
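To make the three features concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a generic first-fit-decreasing packing heuristic, and the `Node` class, the `pack` and `underutilized` helpers, the resource units, and the 0.7 utilization threshold are all hypothetical. Tasks that cannot be placed suggest a need to scale out, and nodes below the threshold are candidates for rescheduling and shutdown.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A VM instance with fixed capacity (units are illustrative assumptions)."""
    name: str
    cpu: float  # total CPU cores
    mem: float  # total memory in GiB
    tasks: list = field(default_factory=list)
    used_cpu: float = 0.0
    used_mem: float = 0.0

    def fits(self, cpu, mem):
        return self.used_cpu + cpu <= self.cpu and self.used_mem + mem <= self.mem

    def place(self, task_name, cpu, mem):
        self.tasks.append(task_name)
        self.used_cpu += cpu
        self.used_mem += mem

    def utilization(self):
        # Use the more constrained resource as the node's utilization.
        return max(self.used_cpu / self.cpu, self.used_mem / self.mem)


def pack(tasks, nodes):
    """First-fit-decreasing: largest tasks first, onto the first node that fits.

    Returns the names of tasks that no existing node can host, which an
    autoscaler could treat as a signal to add instances.
    """
    unplaced = []
    for name, cpu, mem in sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True):
        target = next((n for n in nodes if n.fits(cpu, mem)), None)
        if target is None:
            unplaced.append(name)
        else:
            target.place(name, cpu, mem)
    return unplaced


def underutilized(nodes, threshold):
    """Nodes whose utilization is below the (hypothetical) threshold."""
    return [n.name for n in nodes if n.utilization() < threshold]


# Illustrative heterogeneous cluster and task set: (name, cpu, mem).
nodes = [Node("small", 2, 4), Node("large", 8, 16)]
tasks = [("web", 1, 2), ("batch", 4, 8), ("db", 2, 4), ("huge", 16, 32)]

unplaced = pack(tasks, nodes)
scale_in_candidates = underutilized(nodes, threshold=0.7)
```

In this example `huge` fits on no existing node and ends up in `unplaced`, while `large` is only 62.5% utilized and is reported as a scale-in candidate. A real rescheduler would also need to checkpoint or migrate the running tasks before shutting the instance down, which this sketch does not model.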
A Cost-Efficient Container Orchestration Strategy in Kubernetes-Based Cloud Computing Infrastructures with Heterogeneous Resources