poster

An overview of CMPI: network performance aware MPI in the cloud

Published: 25 February 2012

Abstract

Cloud computing lets users run distributed computing tasks on many virtual machines without owning a physical cluster. Distributed workloads such as scientific applications are increasingly moving from supercomputers and private clusters to public clouds. The Message Passing Interface (MPI) is a key component common to many of these workloads. Because the virtualized environment of a public cloud hides network topology information from users, existing topology-aware optimizations for MPI are no longer feasible there. We propose CMPI, a network performance aware MPI library. CMPI adopts a new model for capturing the network performance among the virtual machines in the cloud. Based on this model, we develop novel network performance aware algorithms for communication operations. This poster gives an overview of the CMPI design and presents preliminary results on collective operations such as broadcast. We demonstrate the effectiveness of our network performance aware optimizations on Amazon EC2.
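
The abstract does not spell out CMPI's performance model or its algorithms, so the following is only an illustrative sketch of the general idea behind a network performance aware broadcast, not the authors' method. It assumes a hypothetical matrix `cost[i][j]` holding the measured time for virtual machine `i` to send the message to virtual machine `j`, and it uses a simple greedy heuristic: at each step, pick the sender and receiver pair whose transfer would complete earliest. The function name and the matrix layout are assumptions made for illustration.

```python
def greedy_broadcast_schedule(cost, root=0):
    """Plan a one-to-all broadcast from measured pairwise send times.

    cost[i][j] is the (assumed, measured) time for node i to send the
    message to node j. Each node performs one send at a time.
    Returns (schedule, finish_time), where schedule is a list of
    (sender, receiver, start, end) tuples in the order they were chosen.
    """
    n = len(cost)
    # Maps each node that already holds the message to the time at which
    # it is free to start its next send.
    ready = {root: 0.0}
    schedule = []
    while len(ready) < n:
        best = None
        for sender, free_at in ready.items():
            for receiver in range(n):
                if receiver in ready:
                    continue  # already has the message
                end = free_at + cost[sender][receiver]
                if best is None or end < best[3]:
                    best = (sender, receiver, free_at, end)
        sender, receiver, start, end = best
        schedule.append(best)
        ready[sender] = end    # sender is busy until this transfer ends
        ready[receiver] = end  # receiver can forward from this point on
    return schedule, max(ready.values())
```

For example, with four nodes where only neighbouring pairs (0-1, 1-2, 2-3) have a send time of 1 and every other pair has a send time of 5, the sketch chains the message along the fast links and finishes at time 3, whereas a root that sends to each node in turn would need 1 + 5 + 5 = 11.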

    • Published in

      ACM SIGPLAN Notices, Volume 47, Issue 8
      PPOPP '12
      August 2012
      334 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2370036
      • PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
        February 2012
        352 pages
        ISBN: 9781450311601
        DOI: 10.1145/2145816

      Copyright © 2012 Authors

      Publisher

      Association for Computing Machinery

      New York, NY, United States
