Abstract
Packing is a key step in the FPGA tool flow that straddles the boundaries between synthesis, technology mapping, and placement. Packing strongly influences circuit speed, density, and power. In this article, we consider packing in the commercial FPGA context and examine the area and performance trade-offs associated with packing in a state-of-the-art FPGA: the Xilinx® Virtex™-5 FPGA. In addition to look-up-table (LUT)-based logic blocks, modern FPGAs also contain large IP blocks, and we discuss packing techniques for both types of blocks. Virtex-5 logic blocks contain dual-output 6-input LUTs. Such a LUT can implement any single logic function of up to 6 inputs, or any two logic functions that together require no more than 5 distinct inputs. The second LUT output has reduced speed and therefore must be used judiciously. We present techniques for dual-output LUT packing that lead to improved area-efficiency with minimal performance degradation. We then describe packing techniques for large IP blocks, namely block RAMs and DSPs. We pack circuits into the large blocks in a way that leverages the unique block RAM and DSP layout/architecture in Virtex-5, achieving significantly improved design performance.
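The dual-output LUT constraint stated in the abstract can be illustrated with a small feasibility check. This is only a sketch of the input-count rule as the abstract words it, not the packing algorithm from the article; the function names and the representation of a logic function by its set of input signal names are assumptions made for illustration, and the reduced speed of the second output is not modeled.

```python
from typing import Iterable

# Limits taken from the abstract's description of the Virtex-5 LUT.
MAX_SINGLE_INPUTS = 6  # one function of up to 6 inputs
MAX_SHARED_INPUTS = 5  # two functions, at most 5 distinct inputs combined


def fits_single(inputs: Iterable[str]) -> bool:
    """True if one function with these inputs fits in a LUT by itself (hypothetical helper)."""
    return len(set(inputs)) <= MAX_SINGLE_INPUTS


def can_share_lut(inputs_a: Iterable[str], inputs_b: Iterable[str]) -> bool:
    """True if two functions may occupy one dual-output LUT (hypothetical helper).

    Only the distinct-input count is checked; timing is ignored.
    """
    distinct = set(inputs_a) | set(inputs_b)
    return len(distinct) <= MAX_SHARED_INPUTS


# Two 3-input functions sharing input "c" use 5 distinct inputs in total.
shared_ok = can_share_lut(["a", "b", "c"], ["c", "d", "e"])
# Two 3-input functions with no common inputs would need 6 distinct inputs.
disjoint_ok = can_share_lut(["a", "b", "c"], ["d", "e", "f"])
```

Under this rule, the first pair is a packing candidate and the second is not. A real packer would also weigh whether either function lies on a timing-critical path before assigning it to the slower second output.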
Packing Techniques for Virtex-5 FPGAs