Abstract
State-of-the-art MPI libraries rely on locks to guarantee thread safety. This discourages application developers from using multiple threads to perform MPI operations. In this paper, we propose a high-performance, lock-free multi-endpoint MPI runtime, which can achieve up to 40% improvement for point-to-point operations and one representative collective operation with minimal or no modifications to existing applications.
Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems
PPoPP '14: Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming