Abstract
Hypervisor-based virtualization solutions provide strong security and isolation, while container-based solutions make applications and workloads more portable and allow them to be distributed in an effective, standardized, and repeatable way. Nested virtualization based computing environments (e.g., containers over virtual machines), which inherit the capabilities of both solutions, are therefore becoming increasingly attractive in clouds (e.g., running Docker over Amazon EC2 VMs). Recent studies have shown that running applications in either VMs or containers still incurs significant overhead, especially for I/O-intensive workloads. This motivates us to investigate whether a nested virtualization based solution can be adopted to build high-performance computing (HPC) clouds that run MPI applications efficiently, and where the bottlenecks lie. To eliminate performance bottlenecks, we propose a high-performance two-layer locality and NUMA aware MPI library, which dynamically detects co-resident containers inside one VM, as well as co-resident VMs on one host, at MPI runtime. MPI processes in different containers and VMs can thus communicate with each other through shared memory or Cross Memory Attach (CMA) channels instead of the network channel when they are co-resident. We further propose an enhanced NUMA aware hybrid design that uses an InfiniBand loopback based channel to optimize large message transfers across containers running on different sockets. Performance evaluations show that, compared with the state-of-the-art (1Layer) design, our proposed enhanced-hybrid design delivers improvements of up to 184%, 81%, and 12% for point-to-point operations, collective operations, and end applications, respectively. Compared with the default performance, our enhanced-hybrid design delivers improvements of up to 184%, 85%, and 16%.
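The channel choice described in the abstract can be illustrated with a minimal sketch. It is not the library's implementation: the `Rank` fields, the `select_channel` function, the channel names, and the `LARGE_MSG_BYTES` cutoff are all hypothetical, and using shared memory for small messages and CMA for large ones is an assumption, since the abstract does not say how the two co-resident channels are divided.

```python
from dataclasses import dataclass

# Hypothetical cutoff in bytes for a "large" message; the abstract gives no value.
LARGE_MSG_BYTES = 16 * 1024


@dataclass(frozen=True)
class Rank:
    """Placement of one MPI process (all fields are illustrative)."""
    host: str       # physical host
    vm: str         # VM on that host
    container: str  # container inside that VM
    socket: int     # NUMA socket the process runs on


def select_channel(a: Rank, b: Rank, msg_bytes: int) -> str:
    """Return an illustrative channel name for a message from a to b."""
    # Processes on different hosts are not co-resident: use the network.
    if a.host != b.host:
        return "network"

    large = msg_bytes >= LARGE_MSG_BYTES
    different_container = (a.vm, a.container) != (b.vm, b.container)

    # Large transfers across containers on different sockets take the
    # InfiniBand loopback channel, as the abstract describes.
    if large and different_container and a.socket != b.socket:
        return "ib_loopback"

    # Assumption: CMA for large co-resident messages, shared memory otherwise.
    return "cma" if large else "shared_memory"


if __name__ == "__main__":
    p0 = Rank("host1", "vm1", "c1", socket=0)
    p1 = Rank("host1", "vm1", "c2", socket=1)
    print(select_channel(p0, p1, 64))
    print(select_channel(p0, p1, 1 << 20))
```

In this sketch, locality is checked first (host), and NUMA placement only matters for large messages between containers, which mirrors the order in which the abstract introduces the two-layer detection and the hybrid design.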
Designing Locality and NUMA Aware MPI Runtime for Nested Virtualization based HPC Cloud with SR-IOV Enabled InfiniBand
VEE '17: Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments