short-paper

Supporting High Performance Molecular Dynamics in Virtualized Clusters using IOMMU, SR-IOV, and GPUDirect

Published: 14 March 2015

Abstract

Cloud Infrastructure-as-a-Service paradigms have recently shown their utility for a vast array of computational problems, ranging from advanced web service architectures to high-throughput computing. However, many scientific computing applications have been slow to adapt to virtualized cloud frameworks. This is due to the performance impact of virtualization technologies, coupled with the lack of advanced hardware support necessary for running many high-performance scientific applications at scale.

By using KVM virtual machines that leverage both Nvidia GPUs and InfiniBand, we show that molecular dynamics simulations with LAMMPS and HOOMD run at near-native speeds. This experiment also illustrates how virtualized environments can support the latest parallel computing paradigms, including both MPI+CUDA and new GPUDirect RDMA functionality. These findings show initial promise for scaling such applications to larger production deployments targeting large-scale computational workloads.
