Abstract
We can design an FPGA-optimized lightweight network-on-chip (NoC) router for flit-oriented packet-switched communication that is an order of magnitude smaller (in terms of LUTs and FFs) than state-of-the-art FPGA overlay routers. We present Hoplite, an efficient, lightweight, and fast FPGA overlay NoC that is designed to be compact by (1) using deflection routing instead of buffered switching to eliminate expensive FIFO buffers and (2) using a torus topology to reduce the cost of the switch crossbar. Buffering and crossbar implementation complexity have traditionally limited speeds and imposed heavy resource costs in conventional FPGA overlay NoCs. We exploit the fracturable lookup table (LUT) organization of the FPGA to further improve the resource efficiency of mapping the expensive crossbar multiplexers. For single-flit-oriented FPGA applications under uniform random traffic, Hoplite can outperform classic, bidirectional, buffered mesh networks by as much as 1.5× (best achievable throughputs for a 10×10 system) or 2.5× (allocating the same amount of FPGA resources to both NoCs). Compared to buffered mesh switches, FPGA-based deflection routers are ≈3.5× smaller (HLS-generated switch) and 2.5× faster (clock period) for 32-bit payloads. In a separate experiment, we hand-crafted an RTL version of our switch with location constraints that requires only 60 LUTs and 100 FFs per router and runs with a 2.9 ns clock period. We conduct additional layout experiments on modern Xilinx and Altera FPGAs and demonstrate wide-channel, chip-spanning layouts that run in excess of 300 MHz while consuming 10--15% of overall chip resources. We also demonstrate a clustered RISC-V multiprocessor organization that uses Hoplite to help deliver the high processing throughput of the FPGA architecture to user applications.
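To illustrate how deflection routing removes the need for FIFO buffers, the following is a minimal behavioral sketch of one bufferless switch on a unidirectional torus. It is an illustration under stated assumptions, not the paper's RTL: the names `Flit` and `route_step`, the X-then-Y (dimension-ordered) routing policy, the priority of the west input over the north input, and the delivery of flits through the south port are all assumptions made for this example.

```python
from typing import NamedTuple, Optional, Tuple


class Flit(NamedTuple):
    """Hypothetical single-flit packet: destination coordinates plus payload."""
    dst_x: int
    dst_y: int
    payload: int = 0


def route_step(
    x: int,
    y: int,
    west: Optional[Flit] = None,
    north: Optional[Flit] = None,
    pe: Optional[Flit] = None,
) -> Tuple[Optional[Flit], Optional[Flit], Optional[Flit], bool]:
    """Route one cycle at switch (x, y) with no buffering.

    Returns (east_out, south_out, delivered, pe_accepted).
    Assumed policy: route along X first, then Y; the west input wins
    conflicts; the local processing element (PE) injects only into a
    free output port.
    """
    east: Optional[Flit] = None
    south: Optional[Flit] = None

    # West input has priority: turn south once the X coordinate matches.
    if west is not None:
        if west.dst_x == x:
            south = west
        else:
            east = west

    # North input prefers to continue south; if the south port is taken
    # it is deflected east and circles the ring instead of being queued.
    if north is not None:
        if south is None:
            south = north
        else:
            east = north  # deflection (east is free because west went south)

    # PE injection has the lowest priority and never displaces traffic.
    accepted = False
    if pe is not None:
        if pe.dst_x == x and south is None:
            south, accepted = pe, True
        elif pe.dst_x != x and east is None:
            east, accepted = pe, True

    # A flit on the south port that has reached its destination exits here.
    delivered: Optional[Flit] = None
    if south is not None and (south.dst_x, south.dst_y) == (x, y):
        delivered, south = south, None

    return east, south, delivered, accepted
```

With two network inputs and two network outputs, every arriving flit always leaves on some port in the same cycle, which is why no per-port FIFO is needed in this model; the cost is that a deflected flit takes a longer path around the torus.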
Index Terms
Hoplite: A Deflection-Routed Directional Torus NoC for FPGAs