Abstract
Cloud computing enables users to perform distributed computing tasks on many virtual machines without owning a physical cluster. Recently, various distributed computing tasks, such as scientific applications, have been moving from supercomputers and private clusters to public clouds. The Message Passing Interface (MPI) is a key and common component of such tasks. The virtualized computing environment of the public cloud hides network topology information from users, so existing topology-aware optimizations for MPI are no longer feasible in the cloud. We propose a network-performance-aware MPI library named CMPI. CMPI adopts a new model for capturing the network performance among the virtual machines in the cloud. Based on this model, we develop novel network-performance-aware algorithms for communication operations. This poster gives an overview of the CMPI design and presents preliminary results on collective operations such as broadcast. We demonstrate the effectiveness of our network-performance-aware optimizations on Amazon EC2.
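To make the idea of a network-performance-aware broadcast concrete, the following is a minimal sketch and not CMPI's actual algorithm, which the abstract does not describe. It assumes a measured pairwise cost matrix `cost[i][j]` (the time to send the message from VM `i` to VM `j`) and applies a simple greedy heuristic: at each step, pick the transfer from an informed VM to an uninformed VM that would finish earliest. The function name `greedy_broadcast_schedule` and the example matrix are hypothetical.

```python
def greedy_broadcast_schedule(cost, root=0):
    """Return a list of (sender, receiver, finish_time) for a one-to-all broadcast.

    cost[i][j] is the assumed time to send the message from VM i to VM j
    (an illustrative stand-in for a measured network performance model).
    """
    n = len(cost)
    ready = {root: 0.0}  # time at which each informed VM is free to send
    schedule = []
    while len(ready) < n:
        best = None
        # Consider every (informed sender, uninformed receiver) pair.
        for sender, free_at in ready.items():
            for receiver in range(n):
                if receiver in ready:
                    continue
                finish = free_at + cost[sender][receiver]
                if best is None or finish < best[2]:
                    best = (sender, receiver, finish)
        sender, receiver, finish = best
        ready[sender] = finish    # sender is busy until the transfer completes
        ready[receiver] = finish  # receiver can forward the message afterwards
        schedule.append(best)
    return schedule


# Hypothetical 4-VM cost matrix: neighbours in the chain 0-1-2-3 are fast,
# all other pairs are slow.
example_cost = [
    [0, 1, 10, 10],
    [1, 0, 1, 10],
    [10, 1, 0, 1],
    [10, 10, 1, 0],
]
print(greedy_broadcast_schedule(example_cost, root=0))
# → [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0)]
```

With this matrix the heuristic relays the message along the fast links and finishes at time 3, whereas a flat broadcast in which the root sends to every VM directly would take 1 + 10 + 10 = 21 time units under the same one-send-at-a-time assumption.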
An overview of CMPI: network performance aware MPI in the cloud
PPoPP '12: Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming