Abstract
Field-programmable gate arrays (FPGAs) are an increasingly attractive alternative to traditional microprocessor-based computing architectures in extreme-computing domains, such as aerospace and supercomputing. FPGAs provide several resource types with different tradeoffs between speed, power, and area, which makes FPGAs highly flexible across varying application computational requirements. However, because an application’s computational operations can map to different resource types, a major challenge in leveraging resource-diverse FPGAs is determining the optimal distribution of these operations across a device’s available resources. Repeating this choice across many candidate FPGA devices results in an extremely large design space. To facilitate fast design-space exploration, this article presents a method based on linear programming (LP) that determines the optimal operation distribution for a particular device and application with respect to performance, power, or dependability metrics. Our LP method is an effective tool for early design exploration: it quickly analyzes thousands of FPGAs to determine the best devices and operation distributions, which significantly reduces design time. We demonstrate our LP method’s effectiveness with two case studies involving dot-product and distance-calculation kernels on a range of Virtex-5 FPGAs. Results show that our LP method selects optimal distributions of operations to within an average of 4% of actual values.
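To make the idea of an LP over operation distributions concrete, the following is a generic sketch of how such a problem might be posed. It is not the article's actual formulation, which the abstract does not give. All symbols are assumptions introduced here for illustration: $O$ is the set of operation types in the application, $M_o$ is the set of ways operation $o$ can be mapped onto device resources, $x_{o,m}$ is the (relaxed, continuous) number of instances of $o$ built with mapping $m$, $p_{o,m}$ is the metric contribution of one such instance, $c_{o,m,r}$ is how much of resource type $r$ one instance consumes, and $R_r$ is the amount of resource $r$ available on the device.

```latex
% Hypothetical formulation for illustration only; requires amsmath.
\[
\begin{aligned}
\max_{x} \quad & \sum_{o \in O} \sum_{m \in M_o} p_{o,m}\, x_{o,m} \\
\text{subject to} \quad
  & \sum_{o \in O} \sum_{m \in M_o} c_{o,m,r}\, x_{o,m} \le R_r
    && \text{for every resource type } r, \\
  & x_{o,m} \ge 0
    && \text{for every } o \in O,\ m \in M_o .
\end{aligned}
\]
```

Under these assumptions, exploring a different device would only change the capacities $R_r$, and targeting a different metric would only change the coefficients $p_{o,m}$ (and the optimization direction, for a metric to be minimized). That is one plausible reason an LP of this shape could be re-solved quickly across many devices.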
Optimizing FPGA Performance, Power, and Dependability with Linear Programming