poster

POSTER: Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures

Published: 26 January 2017

Abstract

In the many-core era, the performance of MPI collectives depends increasingly on their intra-node communication component. However, intra-node communication algorithms are generally inherited from their inter-node counterparts and ignore cache complexity. We propose cache-oblivious algorithms for MPI all-to-all operations, in which data blocks are copied into the receive buffers in Morton order to exploit data locality. Experimental results on different many-core architectures show that our cache-oblivious implementations significantly outperform both naive shared-heap implementations and highly optimized MPI libraries.
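
To illustrate what copying blocks "in Morton order" could look like, here is a minimal single-process sketch. It is not the authors' implementation: the function names `morton_decode` and `alltoall_morton` are hypothetical, the per-rank buffers are plain Python lists standing in for a shared heap, and the rank count is assumed to be a power of two. The sketch visits the (source, destination) pairs of the all-to-all along a Z-order curve, so consecutive copies touch neighbouring send and receive buffers.

```python
def morton_decode(z):
    """De-interleave the bits of z: even bits form col, odd bits form row."""
    row = col = 0
    bit = 0
    while z:
        col |= (z & 1) << bit
        row |= ((z >> 1) & 1) << bit
        z >>= 2
        bit += 1
    return row, col


def alltoall_morton(send, block):
    """Simulate an intra-node all-to-all over shared buffers (illustrative only).

    send[s] is rank s's send buffer holding p blocks of `block` items each;
    block d of send[s] is destined for rank d. Returns (recv, order), where
    block s of recv[d] came from rank s and `order` records the sequence of
    (src, dst) pairs that were copied.
    """
    p = len(send)
    if p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two rank count")
    recv = [[None] * (p * block) for _ in range(p)]
    order = []
    # Walk the p x p (src, dst) grid along a Z-order (Morton) curve.
    for z in range(p * p):
        src, dst = morton_decode(z)
        lo_send, lo_recv = dst * block, src * block
        recv[dst][lo_recv:lo_recv + block] = send[src][lo_send:lo_send + block]
        order.append((src, dst))
    return recv, order


if __name__ == "__main__":
    p, block = 4, 2
    # Tag every item with (source rank, destination rank, offset in block).
    send = [[(s, d, k) for d in range(p) for k in range(block)] for s in range(p)]
    recv, order = alltoall_morton(send, block)
    print(order[:8])
```

A row-major traversal would finish all of one source rank's blocks before moving on. The Z-order traversal instead stays within small square sub-grids of sources and destinations, which is the locality property the abstract refers to. In a real MPI setting the copies would be divided among the ranks rather than run in one loop.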

      • Published in

        ACM SIGPLAN Notices, Volume 52, Issue 8
        PPoPP '17
        August 2017
        442 pages
        ISSN: 0362-1340
        EISSN: 1558-1160
        DOI: 10.1145/3155284

        PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
        January 2017
        476 pages
        ISBN: 9781450344937
        DOI: 10.1145/3018743

        Copyright © 2017 Owner/Author

        Publisher

        Association for Computing Machinery

        New York, NY, United States
