Abstract
One of the key obstacles to pervasive deployment of FPGA accelerators in data centers is their cumbersome programming model. Open-source tooling has been suggested as a way to develop alternative EDA tools that remedy this issue. Open-source FPGA CAD tools have traditionally targeted hypothetical academic architectures, making them impractical for commercial devices. Recently, there have been efforts to develop open-source back-end tools targeting commercial devices. These tools claim to follow an alternative, data-driven approach that makes them more adaptable to domain requirements such as faster compile time. In this paper, we present RWRoute, the first open-source timing-driven router for UltraScale+ devices. RWRoute is built on the RapidWright framework and includes the essential and pragmatic features found in commercial FPGA routers that are often missing from open-source tools. Another valuable contribution of this work is an open-source, lightweight timing model with high-fidelity timing approximations. By leveraging a combination of architectural knowledge, repeating patterns, and extensive analysis of Vivado timing reports, we obtain a slightly pessimistic, lumped delay model that is within 2% of Vivado on average for UltraScale+ devices. Compared to Vivado, RWRoute achieves a 4.9× compile-time improvement at the expense of a 10% loss in Quality of Results (QoR) across 665 synthetic and six real designs. A main benefit of our router is that it enables fast partial routing at the back end of a domain-specific flow. Our initial results indicate that a compile-time improvement of more than 9× is achievable for partial routing. These results show how such a router can benefit a low-touch flow and reduce dependency on commercial tools.
Index Terms
RWRoute: An Open-source Timing-driven Router for Commercial FPGAs