Abstract
Networks with Remote Direct Memory Access (RDMA) support are becoming increasingly common. RDMA, however, offers a limited programming interface to remote memory that consists of read, write and atomic operations. With RDMA alone, completing the most basic operations on remote data structures often requires multiple round-trips over the network. Data-intensive systems strongly desire higher-level communication abstractions that support more complex interaction patterns.
A natural candidate to consider is MPI, the de facto standard for developing high-performance applications in the HPC community. This paper critically evaluates the communication primitives of MPI and shows that using MPI in the context of a data processing system comes with its own set of insurmountable challenges. Based on this analysis, we propose a new communication abstraction named RDMO, or Remote Direct Memory Operation, that dispatches a short sequence of reads, writes and atomic operations to remote memory and executes them in a single round-trip.
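The round-trip problem described above can be illustrated with a toy model. The sketch below is not a real RDMA API (all names are hypothetical): remote memory is modeled as a flat address space, and each read counts as one network round-trip. Looking up a key in a remote linked list then costs one round-trip per node visited, whereas an RDMO-style primitive would ship the whole traversal to the remote side in a single round-trip.

```python
# Illustrative model only, not a real RDMA API: remote memory is a
# flat address space, and each read() models one network round-trip.
class RemoteMemory:
    def __init__(self):
        self.mem = {}         # address -> value
        self.round_trips = 0  # round-trips issued so far

    def read(self, addr):
        self.round_trips += 1
        return self.mem[addr]

# A 3-node singly linked list laid out in "remote" memory.
# Node layout: addr -> (key, next_addr)
rm = RemoteMemory()
rm.mem = {0x10: (1, 0x20), 0x20: (2, 0x30), 0x30: (3, None)}

def remote_lookup(rm, head, key):
    """Pointer chase using one-sided reads: one round-trip per node."""
    addr = head
    while addr is not None:
        k, nxt = rm.read(addr)
        if k == key:
            return True
        addr = nxt
    return False

found = remote_lookup(rm, 0x10, 3)
print(found, rm.round_trips)  # finding the last key costs 3 round-trips
```

With plain RDMA reads, the client must wait for each node before it can follow the next pointer, so latency grows linearly with traversal depth; an RDMO executes the same read sequence remotely and returns only the final answer.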