research-article

Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability

Published: 19 March 2018
Abstract

Emerging chips with hundreds and thousands of cores require networks with unprecedented energy/area efficiency and scalability. To address this, we propose Slim NoC (SN): a new on-chip network design that delivers significant improvements in efficiency and scalability compared to the state of the art. The key idea is to use two concepts from graph and number theory, degree-diameter graphs combined with non-prime finite fields, to enable the smallest number of ports for a given core count. SN is inspired by state-of-the-art off-chip topologies; it identifies and distills their advantages for NoC settings while solving several key issues that lead to significant overheads on chip. SN provides NoC-specific layouts, which further enhance area/energy efficiency. We show how to augment SN with state-of-the-art router microarchitecture schemes, such as Elastic Links, to make the network even more scalable and efficient. Our extensive experimental evaluations show that SN outperforms both traditional low-radix topologies (e.g., meshes and tori) and modern high-radix networks (e.g., various Flattened Butterflies) in area, latency, throughput, and static/dynamic power consumption for both synthetic and real workloads. SN provides a promising direction in scalable and energy-efficient NoC topologies.
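
As background for the degree-diameter idea mentioned in the abstract, the sketch below computes the classical Moore bound, which caps how many routers a network can have for a given number of router-to-router ports (degree) and a given diameter. It is not the SN construction itself, which the abstract does not spell out. The function names `moore_bound` and `min_degree_for` are hypothetical, and the diameter of 2 used in the usage note is chosen only for illustration.

```python
def moore_bound(degree: int, diameter: int) -> int:
    """Upper bound on the number of vertices in a graph with the given
    maximum degree and diameter (the Moore bound)."""
    if degree < 1 or diameter < 0:
        raise ValueError("degree must be >= 1 and diameter must be >= 0")
    total = 1            # the starting vertex
    frontier = degree    # at most `degree` vertices at distance 1
    for _ in range(diameter):
        total += frontier
        # Each vertex in the frontier has used one port to reach its parent,
        # so it can reach at most `degree - 1` new vertices.
        frontier *= degree - 1
    return total


def min_degree_for(routers: int, diameter: int) -> int:
    """Smallest degree whose Moore bound admits `routers` vertices.
    This is only a lower bound on the ports a real topology would need."""
    if routers < 1 or diameter < 1:
        raise ValueError("routers must be >= 1 and diameter must be >= 1")
    degree = 1
    while moore_bound(degree, diameter) < routers:
        degree += 1
    return degree
```

For example, `moore_bound(7, 2)` evaluates to 50, so no diameter-2 network of 50 routers can use fewer than 7 router-to-router ports per router; `min_degree_for(200, 2)` evaluates to 15. Ports that attach cores to a router are not counted here.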

Published in

ACM SIGPLAN Notices, Volume 53, Issue 2 (ASPLOS '18), February 2018, 809 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3296957

ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, March 2018, 827 pages
ISBN: 9781450349116
DOI: 10.1145/3173162

Copyright © 2018 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
