skip to main content
research-article
Distinguished Paper

Modeling and analysis of remote memory access programming

Published:19 October 2016Publication History
Skip Abstract Section

Abstract

Recent advances in networking hardware have led to a new generation of Remote Memory Access (RMA) networks in which processors from different machines can communicate directly, bypassing the operating system and allowing higher performance. Researchers and practitioners have proposed libraries and programming models for RMA to enable the development of applications running on these networks,

However, the memory models implied by these RMA libraries and languages are often loosely specified, poorly understood, and differ depending on the underlying network architecture and other factors. Hence, it is difficult to precisely reason about the semantics of RMA programs or how changes in the network architecture affect them.

We address this problem with the following contributions: (i) a coreRMA language which serves as a common foundation, formalizing the essential characteristics of RMA programming; (ii) complete axiomatic semantics for that language; (iii) integration of our semantics with an existing constraint solver, enabling us to exhaustively generate coreRMA programs (litmus tests) up to a specified bound and check whether the tests satisfy their specification; and (iv) extensive validation of our semantics on real-world RMA systems. We generated and ran 7441 litmus tests using each of the low-level RMA network APIs: DMAPP, VPI Verbs, and Portals 4. Our results confirmed that our model successfully captures behaviors exhibited by these networks. Moreover, we found RMA programs that behave inconsistently with existing documentation, confirmed by network experts.

Our work provides an important step towards understanding existing RMA networks, thus influencing the design of future RMA interfaces and hardware.

References

  1. P. A. Abdulla, M. F. Atig, Y. Chen, C. Leonardsson, and A. Rezine. Automatic fence insertion in integer programs via predicate abstraction. In Static Analysis - 19th International Symposium, SAS 2012, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. J. Alglave, D. Kroening, V. Nimal, and M. Tautschnig. Software verification for weak memory via program transformation. In Programming Languages and Systems - 22nd European Symposium on Programming, ESOP 2013, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. J. Alglave, D. Kroening, V. Nimal, and D. Poetzl. Don’t sit on the fence—A static analysis approach to automatic fence insertion. In Computer Aided Verification - 26th International Conference, CAV 2014, 2014a. J. Alglave, L. Maranget, and M. Tautschnig. Herding cats: Modelling, simulation, testing, and data mining for weak memory. ACM Trans. Program. Lang. Syst., 36(2):7, 2014b. doi: 10.1145/2627752. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Alverson, D. Roweth, and L. Kaplan. The Gemini system interconnect. In Proc. of the IEEE Symposium on High Performance Interconnects (HOTI’10), pages 83–87. IEEE Computer Society, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, T. Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni, and R. Rajamony. The PERCS high-performance interconnect. In Proc. of the IEEE Symposium on High Performance Interconnects (HOTI’10), pages 75–82. IEEE Computer Society, Aug. 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. B. W. Barrett, R. B. Brightwell, K. T. T. Pedretti, K. B. Wheeler, K. S. Hemmert, R. E. Riesen, K. D. Underwood, A. B. Maccabe, and T. B. Hudson. The Portals 4.0 network programming interface. Technical report, Sandia National Laboratories, 2012.Google ScholarGoogle Scholar
  7. SAND2012-10087.Google ScholarGoogle Scholar
  8. R. Belli and T. Hoefler. Notified Access: Extending Remote Memory Access Programming Models for Producer-Consumer Synchronization. IEEE, May 2015. Accepted at IPDPS’15. J. C. Blanchette, T. Weber, M. Batty, S. Owens, and S. Sarkar. Nitpicking C++ concurrency. In Proceedings of the 13th International ACM SIGPLAN Symposium on Principles and Practices of Declarative Programming, PPDP ’11, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. A. Bouajjani, E. Derevenetc, and R. Meyer. Checking and enforcing robustness against TSO. In Programming Languages and Systems - 22nd European Symposium on Programming, ESOP 2013, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. S. Burckhardt and M. Musuvathi. Effective program verification for relaxed memory models. In Computer Aided Verification, 20th International Conference, CAV 2008, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Burckhardt, R. Alur, and M. M. K. Martin. Checkfence: checking consistency of concurrent data types on relaxed memory models. In Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Burnim, K. Sen, and C. Stergiou. Sound and complete monitoring of sequential consistency for relaxed memory models. In Tools and Algorithms for the Construction and Analysis of Systems - 17th International Conference, TACAS 2011, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. D. Chen, N. A. Eisley, P. Heidelberger, R. M. Senger, Y. Sugawara, S. Kumar, V. Salapura, D. L. Satterfield, B. Steinmacher-Burow, and J. J. Parker. The IBM Blue Gene/q Interconnection Network and Message Unit. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’11, pages 26:1–26:10, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0771-0. doi: 10.1145/2063384.2063419. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. G. Faanes, A. Bataineh, D. Roweth, T. Court, E. Froese, B. Alverson, T. Johnson, J. Kopnick, M. Higgins, and J. Reinhard. Cray Cascade: A scalable HPC system based on a Dragonfly network. In Proc. of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC’12), pages 103:1–103:9. IEEE Computer Society, 2012. ISBN 978-1-4673- 0804-5. R. Gerstenberger, M. Besta, and T. Hoefler. Enabling Highlyscalable Remote Memory Access Programming with MPI-3 One Sided. In Proc. of the ACM/IEEE Supercomputing, SC ’13, pages 53:1–53:12, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. S. Hefty. Scalable fabric interfaces, 2014. OpenFabrics International Developer Workshop 2014.Google ScholarGoogle Scholar
  16. T. Hoefler, J. Dinan, R. Thakur, B. Barrett, P. Balaji, W. Gropp, and K. Underwood. Remote Memory Access Programming in MPI-3. Argonne National Laboratory, Tech. Rep, 2013.Google ScholarGoogle Scholar
  17. N. S. Islam, M. W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang, H. Subramoni, C. Murthy, and D. K. Panda. High performance RDMA-based design of HDFS over InfiniBand. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC ’12, pages 35:1– 35:35, Los Alamitos, CA, USA, 2012. IEEE Computer Society Press. ISBN 978-1-4673-0804-5. D. Jackson. Software Abstractions: Logic, Language, and Analysis. The MIT Press, 2006. ISBN 0262101149. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. N. Jiang, J. Kim, and W. J. Dally. Indirect adaptive routing on large scale interconnection networks. SIGARCH Comput. Archit. News, 37(3):220–231, June 2009. ISSN 0163-5964. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Kumar, A. Mamidala, D. A. Faraj, B. Smith, M. Blocksome, B. Cernohous, D. Miller, J. Parker, J. Ratterman, P. Heidelberger, D. Chen, and B. D. Steinmacher-Burrow. PAMI: A parallel active message interface for the Blue Gene/Q supercomputer. In Proc. of the IEEE International Parallel and Distributed Processing Symposium (IPDPS’12), pages 763–773. IEEE Computer Society, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. Kuperstein, M. T. Vechev, and E. Yahav. Automatic inference of memory fences. In Proceedings of 10th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2010, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. M. Kuperstein, M. T. Vechev, and E. Yahav. Partial-coherence abstractions for relaxed memory models. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. G. Li, R. Palmer, M. DeLisi, G. Gopalakrishnan, and R. M. Kirby. Formal specification of MPI 2.0: Case study in specifying a practical concurrent programming API. Sci. Comput. Program., 76(2):65–81, Feb. 2011. ISSN 0167-6423. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Linden and P. Wolper. A verification-based approach to memory fence insertion in PSO memory systems. In Tools and Algorithms for the Construction and Analysis of Systems - 19th International Conference, TACAS 2013, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. F. Liu, N. Nedev, N. Prisadnikov, M. T. Vechev, and E. Yahav. Dynamic synthesis for relaxed memory models. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Y. Meshman, A. M. Dan, M. T. Vechev, and E. Yahav. Synthesis of memory fences via refinement propagation. In Static Analysis - 21st International Symposium, SAS 2014, 2014.Google ScholarGoogle Scholar
  26. B. Norris and B. Demsky. CDSchecker: checking concurrent data structures written with C/C++ atomics. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA 2013, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. R. W. Numrich and J. Reid. Co-array Fortran for parallel programming. SIGPLAN Fortran Forum, 17(2):1–31, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. OpenFabrics Alliance (OFA). OpenFabrics Enterprise Distribution (OFED) www.openfabrics.org, 2014.Google ScholarGoogle Scholar
  29. S. Owens, S. Sarkar, and P. Sewell. A better x86 memory model: x86-TSO. In Theorem Proving in Higher Order Logics, 22nd International Conference, TPHOLs 2009, Munich, Germany, August 17-20, 2009. Proceedings, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. C.-S. Park, K. Sen, P. Hargrove, and C. Iancu. Efficient data race detection for distributed memory parallel programs. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’11, pages 51:1–51:12, New York, NY, USA, 2011. ACM. ISBN 978-1- 4503-0771-0. C. S. Park, K. Sen, and C. Iancu. Scaling data race detection for partitioned global address space programs. In Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, ICS ’13, pages 47–58, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2130-3. doi: 10.1145/2464996.2465000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. M. Poke and T. Hoefler. Dare: High-performance state machine replication on rdma networks. In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, HPDC ’15, pages 107–118, New York, NY, USA, 2015. ACM. ISBN 978-1-4503-3550-8. doi: 10.1145/ 2749246.2749267. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. V. Saraswat, G. Almasi, G. Bikshandi, C. Cascaval, D. Cunningham, D. Grove, S. Kodali, I. Peshansky, and O. Tardieu. The asynchronous partitioned global address space model. In AMP ’10: Proceedings of The First Workshop on Advances in Message Passing, June 2010.Google ScholarGoogle Scholar
  33. S. Sarkar, P. Sewell, J. Alglave, L. Maranget, and D. Williams. Understanding POWER multiprocessors. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. C. SPARC International, Inc. The SPARC Architecture Manual: Version 8. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1992. ISBN 0-13-825001-4. The InfiniBand Trade Association. Infiniband Architecture Spec. Vol. 1, Rel. 1.2. InfiniBand Trade Association, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. E. Torlak, M. Vaziri, and J. Dolby. Memsat: checking axiomatic specifications of memory models. In Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. UPC Consortium. UPC language specifications, v1.2. Technical report, Lawrence Berkeley National Laboratory, 2005. LBNL- 59208.Google ScholarGoogle Scholar
  37. M. Valiev, E. J. Bylaska, N. Govind, K. Kowalski, T. P. Straatsma, H. J. Van Dam, D. Wang, J. Nieplocha, E. Apra, T. L. Windus, et al. NWChem: a comprehensive and scalable open-source solution for large scale molecular simulations. Computer Physics Communications, 181(9):1477–1489, 2010.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Modeling and analysis of remote memory access programming

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM SIGPLAN Notices
      ACM SIGPLAN Notices  Volume 51, Issue 10
      OOPSLA '16
      October 2016
      915 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/3022671
      Issue’s Table of Contents
      • cover image ACM Conferences
        OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
        October 2016
        915 pages
        ISBN:9781450344449
        DOI:10.1145/2983990

      Copyright © 2016 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 19 October 2016

      Check for updates

      Author Tags

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader
    About Cookies On This Site

    We use cookies to ensure that we give you the best experience on our website.

    Learn more

    Got it!