DOI: 10.1145/3485983.3494873 (CoNEXT conference proceedings)

Burst-tolerant datacenter networks with Vertigo

Online: 03 December 2021

ABSTRACT

Microsecond-scale congestion events, known as microbursts, are a main cause of packet loss and poor application performance in today's datacenters. Given the low network utilization in datacenters, one would expect packet deflection, the in-situ re-routing of packets that arrive at a full buffer to a different port, to effectively prevent packet loss. However, if deployed naively, deflection leads to excessive packet re-ordering, exacerbated congestion, and head-of-line blocking in switch buffers. In this study, we resolve these challenges by selectively deflecting the packets that cause persistent congestion in the network. To enable this, we augment the end-host network stacks with a transport-independent extension that tracks and marks flows with their remaining bytes. Our in-network deflection component uses the flow size information to re-route packets from flows with more data to send. Finally, an extension to the receive side of end-host stacks restores the correct ordering of packets before passing them to transport and higher-level protocols. We evaluate our design, Vertigo, under diverse datacenter workloads and show that it is effective in managing microbursts under light and heavy loads and when combined with various congestion control algorithms. For example, in a leaf-spine network under 85% load, Vertigo reduces the mean incast query completion time by 3.5x, 3.3x, and 5x compared to ECMP, DRILL, and DIBS, respectively, when using TCP; by 3x, 3.5x, and 4.5x alongside DCTCP; and by 43x, 33x, and 16x when using Swift.
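
The selective deflection and receive-side reordering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only states that packets carry their flow's remaining bytes and that packets from flows with more data to send are the ones re-routed. Everything else here is an assumption made for illustration, including the names `Packet`, `Port`, `enqueue_or_deflect`, and `ReorderBuffer`, the `remaining_bytes` and `seq` fields, a queue capacity counted in packets, the choice to evict a queued packet rather than only the arriving one, and per-flow sequence numbers that start at 0.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    flow_id: str
    seq: int              # hypothetical per-flow sequence number, starting at 0
    remaining_bytes: int  # hypothetical tag set by the sender-side extension


class Port:
    """Output queue with a fixed capacity (counted in packets for simplicity)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []

    def is_full(self):
        return len(self.queue) >= self.capacity


def enqueue_or_deflect(port, pkt):
    """Enqueue pkt; return the packet that must be deflected, or None.

    Assumed policy: when the queue is full, the packet whose flow has the
    most remaining bytes (among queued packets and the arrival) is deflected.
    """
    if not port.is_full():
        port.queue.append(pkt)
        return None
    # Index of the queued packet whose flow has the most data left to send.
    idx = max(range(len(port.queue)),
              key=lambda i: port.queue[i].remaining_bytes)
    victim = port.queue[idx]
    if victim.remaining_bytes > pkt.remaining_bytes:
        # The arrival belongs to a shorter flow: it takes the victim's place.
        port.queue.pop(idx)
        port.queue.append(pkt)
        return victim
    # The arrival itself has the most remaining bytes: deflect it.
    return pkt


class ReorderBuffer:
    """Receive-side buffer for one flow that releases packets in seq order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}

    def receive(self, pkt):
        self.pending[pkt.seq] = pkt
        released = []
        # Release the longest in-order run starting at next_seq.
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released
```

In this sketch a short flow's packet that arrives at a full queue displaces a packet from a longer flow, and the displaced packet would be sent out of a different port. The receiver then holds back out-of-order arrivals until the gap is filled, so the transport above sees an in-order stream.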

Supplemental Material

3485983.3494873-presentation.mp4

Presentation video

