DOI: 10.1145/3365609.3365851
research-article
Public Access

The Case for a Network Fast Path to the CPU

Online: 14 November 2019

ABSTRACT

For the past two decades, the communication channel between the NIC and CPU has largely remained the same: issuing memory requests across a slow PCIe peripheral interconnect. Today, with application service times and network fabric delays measured in hundreds of nanoseconds, the NIC-CPU interface can account for most of the overhead when programming modern warehouse-scale computers.

In this paper, we tackle this issue head-on by proposing a design for a fast path between the NIC and CPU, called Lightning NIC (L-NIC). L-NIC deviates from the established norms of offloading computation onto the NIC (which inflates latency) or using centralized dispatcher cores for packet scheduling (which limits throughput). It adds support for a fast path from the network to the core of the CPU by writing and reading packets directly to/from the CPU register file. This approach minimizes network IO latency, providing significant performance improvements over traditional NIC-CPU interfaces.

    Published in

      HotNets '19: Proceedings of the 18th ACM Workshop on Hot Topics in Networks
      November 2019
      176 pages
      ISBN: 9781450370202
      DOI: 10.1145/3365609

      Copyright © 2019 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Online: 14 November 2019
      • Published: 14 November 2019

      Qualifiers

      • research-article
      • Research
      • Refereed limited
