Abstract
As FPGA capacity increases, a growing challenge is connecting ever-more components with the current low-level FPGA interconnect while keeping designers productive and on-chip communication efficient. We propose augmenting FPGAs with networks-on-chip (NoCs) to simplify design, and we show that this can be done while maintaining or even improving silicon efficiency. We compare the area and speed efficiency of each NoC component when implemented hard versus soft to explore the space and inform our design choices. We then build on this component-level analysis to architect hard NoCs and integrate them into the FPGA fabric; these NoCs are on average 20 to 23× smaller and 5 to 6× faster than soft NoCs. A 64-node hard NoC uses only about 2% of an FPGA's silicon area and metallization. We introduce a new communication efficiency metric: silicon area required per realized communication bandwidth. Soft NoCs consume 4960 mm²/TBps, but hard NoCs are 84× more efficient at 59 mm²/TBps. Informed design can further reduce the area overhead of NoCs to 23 mm²/TBps, which is only 2.6× less efficient than the simplest point-to-point soft links (9 mm²/TBps). Despite this almost comparable efficiency, NoCs can switch data across the entire FPGA while point-to-point links are very limited in capability; therefore, hard NoCs are expected to improve FPGA efficiency for more complex styles of communication.
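The efficiency metric above is a simple ratio, and the quoted comparisons can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' tooling: the function names `area_per_bandwidth` and `efficiency_ratio` are hypothetical, and the only inputs are the four mm²/TBps figures stated in the abstract.

```python
# Area-per-bandwidth figures quoted in the abstract, in mm^2 per TBps.
# The dictionary keys are illustrative labels, not names from the paper.
AREA_PER_BANDWIDTH = {
    "soft_noc": 4960.0,
    "hard_noc": 59.0,
    "hard_noc_informed": 23.0,
    "soft_point_to_point": 9.0,
}


def area_per_bandwidth(area_mm2: float, bandwidth_tbps: float) -> float:
    """Silicon area required per unit of realized bandwidth (mm^2/TBps).

    Lower values mean more efficient communication.
    """
    if bandwidth_tbps <= 0:
        raise ValueError("bandwidth must be positive")
    return area_mm2 / bandwidth_tbps


def efficiency_ratio(less_efficient: float, more_efficient: float) -> float:
    """How many times more area per TBps the less efficient option needs."""
    return less_efficient / more_efficient


# Soft NoC versus hard NoC: 4960 / 59.
soft_vs_hard = efficiency_ratio(
    AREA_PER_BANDWIDTH["soft_noc"], AREA_PER_BANDWIDTH["hard_noc"]
)

# Informed hard NoC design versus plain point-to-point soft links: 23 / 9.
informed_vs_links = efficiency_ratio(
    AREA_PER_BANDWIDTH["hard_noc_informed"],
    AREA_PER_BANDWIDTH["soft_point_to_point"],
)

print(f"soft vs. hard NoC: {soft_vs_hard:.1f}x")
print(f"informed hard NoC vs. soft links: {informed_vs_links:.1f}x")
```

The two ratios round to the 84× and 2.6× figures given in the abstract. Because the metric divides area by realized bandwidth, a design that is larger in absolute terms can still score better if it delivers proportionally more bandwidth.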
Networks-on-Chip for FPGAs: Hard, Soft or Mixed?