DOI: 10.1145/3337821.3337910
Research article · Open Access

Breaking Band: A Breakdown of High-performance Communication

Published: 05 August 2019

ABSTRACT

The critical path of internode communication on large-scale systems is composed of multiple components. When a supercomputing application initiates the transfer of a message using a high-level communication routine such as an MPI_Send, the payload of the message traverses multiple software stacks, the I/O subsystem on both the host and target nodes, and network components such as the switch. In this paper, we analyze where, why, and how much time is spent on the critical path of communication by modeling the overall injection overhead and end-to-end latency of a system. We focus our analysis on the performance of small messages since fine-grained communication is becoming increasingly important as the number of cores per node continues to grow. The analytical models present an accurate and detailed breakdown of time spent in internode communication. We validate the models on Arm ThunderX2-based servers connected with Mellanox InfiniBand. This is the first work of this kind on Arm. Alongside our breakdown, we describe the methodology to measure the time spent in each component so that readers with access to precise CPU timers and a PCIe analyzer can measure breakdowns on systems of their interest. Such a breakdown is crucial for software developers, system architects, and researchers to guide their optimization efforts. As researchers ourselves, we use the breakdown to simulate the impact and discuss the likelihood of a set of optimizations that target the bottlenecks in today's high-performance communication.
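To give a concrete sense of the software-side portion of such a measurement, the sketch below times the per-call injection overhead of small MPI_Send operations with a high-resolution CPU timer. This is a minimal illustration, not the authors' instrumentation: the paper's methodology additionally uses a PCIe analyzer to attribute time to the I/O subsystem and network, and the message size, iteration count, and CLOCK_MONOTONIC clock source here are assumptions made for the example.

```c
/* Minimal sketch: timing the sender-side overhead of small MPI_Send
 * calls. Illustrative only; it captures the software stack's injection
 * cost, not the full end-to-end breakdown described in the paper.
 * Build: mpicc -O2 send_timer.c -o send_timer */
#include <mpi.h>
#include <stdio.h>
#include <time.h>

#define WARMUP    100     /* warm-up iterations: an assumption */
#define ITERS     10000   /* timed iterations: an assumption   */
#define MSG_BYTES 8       /* "small message" size: an assumption */

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    char buf[MSG_BYTES] = {0};

    if (rank == 0) {
        /* Warm up the path (connection setup, caches, registration). */
        for (int i = 0; i < WARMUP; i++)
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);

        double t0 = now_sec();
        for (int i = 0; i < ITERS; i++)
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        double t1 = now_sec();
        printf("mean MPI_Send overhead: %.1f ns\n",
               (t1 - t0) / ITERS * 1e9);
    } else if (rank == 1) {
        /* Drain all messages so the sender is not throttled by
         * flow control more than necessary. */
        for (int i = 0; i < WARMUP + ITERS; i++)
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}
```

Run with two ranks on two nodes (e.g., `mpirun -n 2 ./send_timer`). For small messages an eager protocol typically lets MPI_Send return once the payload is handed to the NIC, so the loop approximates injection overhead; attributing the remaining end-to-end latency to the I/O subsystem and switch requires the hardware-level tracing the paper describes.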


Published in

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
August 2019 · 1107 pages
ISBN: 9781450362955
DOI: 10.1145/3337821

Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 5 August 2019

Qualifiers

Research article · Refereed limited

Acceptance Rates

Overall acceptance rate: 91 of 313 submissions, 29%
