Abstract
FPGAs are becoming more heterogeneous to better adapt to different markets, motivating rapid exploration of different blocks/tiles for FPGAs. To evaluate a new FPGA architectural idea, one should be able to accurately obtain the area, delay, and energy consumption of the block of interest. However, current FPGA circuit design tools can model only simple, homogeneous FPGA architectures with basic logic blocks; they lack support for DSP and other heterogeneous blocks. Modern FPGAs are instead composed of many different tiles, some of which are designed in a full custom style and some of which mix standard cell and full custom styles.
To fill this modelling gap, we introduce COFFE 2, an open-source FPGA design toolset for automatic FPGA circuit design. COFFE 2 uses a mix of full custom and standard cell flows and supports not only complex logic blocks with fracturable lookup tables and hard arithmetic but also arbitrary heterogeneous blocks. To validate COFFE 2 and demonstrate its features, we design and evaluate a multi-mode Stratix III-like DSP block and several logic tiles with fracturable LUTs and hard arithmetic. We also demonstrate how COFFE 2’s interface to VTR allows full evaluation of block-routing interfaces and various fracturable 6-LUT architectures.
COFFE 2: Automatic Modelling and Optimization of Complex and Heterogeneous FPGA Architectures