Abstract

The bandwidth and latency requirements of modern datacenter applications have led researchers to propose various topology designs using static, dynamic demand-oblivious (rotor), and/or dynamic demand-aware switches. However, given the diverse nature of datacenter traffic, there is little consensus about how these designs would fare against each other. In this work, we analyze the throughput of existing topology designs under different traffic patterns and study their unique advantages and potential costs in terms of bandwidth and latency "tax". To overcome the identified inefficiencies, we propose Cerberus, a unified, two-layer leaf-spine optical datacenter design with three topology types. Cerberus systematically matches different traffic patterns with their most suitable topology type: e.g., latency-sensitive flows are transmitted via a static topology, all-to-all traffic via a rotor topology, and elephant flows via a demand-aware topology. We show analytically and in simulations that Cerberus can improve throughput significantly compared to alternative approaches and operate datacenters at higher loads while being throughput-proportional.
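The matching of traffic to topology types described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that flow size is the classifying signal, with small flows standing in for latency-sensitive traffic and medium flows for all-to-all traffic. The names `Topology` and `choose_topology` and both byte thresholds are hypothetical and are not taken from the abstract.

```python
from enum import Enum


class Topology(Enum):
    """The three topology types named in the abstract."""
    STATIC = "static"
    ROTOR = "rotor"
    DEMAND_AWARE = "demand-aware"


# Hypothetical cut-offs in bytes, chosen only to make the sketch concrete.
SMALL_FLOW_MAX_BYTES = 100 * 1024          # assumed bound for latency-sensitive flows
LARGE_FLOW_MIN_BYTES = 100 * 1024 * 1024   # assumed bound for elephant flows


def choose_topology(flow_size_bytes: int) -> Topology:
    """Pick a topology type for a flow from its size (illustrative rule only)."""
    if flow_size_bytes < 0:
        raise ValueError("flow size must be non-negative")
    if flow_size_bytes <= SMALL_FLOW_MAX_BYTES:
        # Small flows are treated as latency-sensitive: static topology.
        return Topology.STATIC
    if flow_size_bytes >= LARGE_FLOW_MIN_BYTES:
        # Elephant flows: demand-aware topology.
        return Topology.DEMAND_AWARE
    # Everything in between is treated as all-to-all style traffic: rotor topology.
    return Topology.ROTOR


if __name__ == "__main__":
    for size in (4 * 1024, 10 * 1024 * 1024, 1024 ** 3):
        print(size, choose_topology(size).value)
```

In a real deployment the classification signal and the thresholds would depend on the hardware and the workload; the sketch only shows the shape of a three-way choice.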
Cerberus: The Power of Choices in Datacenter Topology Design - A Throughput Perspective
SIGMETRICS/PERFORMANCE '22: Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems