Abstract
One of the key challenges for the FPGA industry going forward is to make the task of designing hardware easier. A significant portion of that design task is the creation of the interconnect pathways between functional structures. We present a synthesis tool that automates this process and focuses on interconnect needs in the fine-grained (sub-IP-block) design space. In this space there are several issues that prior research and tools do not address well: the need for fixed, deterministic latency between communicating units (to enable high-performance local communication without the area overheads of latency insensitivity), and the ability to avoid generating unnecessary arbitration hardware when the application design can do without it. On a design example, the interconnect specification for our tool requires 69% fewer lines of code than a handwritten Verilog implementation, which is a 32% overall reduction for the entire application. The resulting system requires 6% more total functional and interconnect area but achieves the same performance. We also show quantitative and qualitative advantages over an existing commercial interconnect synthesis tool, against which we achieve a 25% performance advantage and logic and memory area savings of 15% and 57%, respectively.
Index Terms
Fine-Grained Interconnect Synthesis