Fast Design Exploration for Performance, Power and Accuracy Tradeoffs in FPGA-Based Accelerators

Published: 01 February 2014

Abstract

The ease of use and reconfigurability of FPGAs make them an attractive platform for accelerating algorithms. However, acceleration becomes a challenging task because the large number of possible design parameters leads to many different accelerator variants. In this article, we propose techniques for fast design exploration and multi-objective optimization to quickly identify both algorithmic and hardware parameters that optimize these accelerators. Information gathered by sampling the design space is used to run regression analysis and train mathematical models within a nonlinear optimization framework to identify the optimal algorithm and design parameters under various objectives and constraints. To automate and improve the model generation process, we propose the use of L1-regularized least squares regression techniques.

We implement two real-time image processing accelerators as test cases: one for image deblurring and one for block matching. For these designs, we demonstrate that by sampling only a small fraction of the design space (0.42% and 1.1%), our modeling techniques are accurate within 2%–4% for area and throughput, 8%–9% for power, and 5%–6% for arithmetic accuracy. We show speedups of 340× and 90× in time for the test cases compared to brute-force enumeration. We also identify the optimal set of parameters for a number of scenarios (e.g., minimizing power under arithmetic inaccuracy bounds).
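To illustrate the L1-regularized least squares step named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a cyclic coordinate descent solver, a synthetic design space with six hypothetical parameters (only two of which affect the metric), and an arbitrary regularization weight `lam`; none of these choices come from the article. The point is only to show how the L1 penalty drives the weights of uninformative parameters to exactly zero, so the fitted model itself indicates which parameters matter.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero; exactly zero when |value| <= threshold."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize 0.5 * ||y - X w||_2^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n_features = X.shape[1]
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0)      # squared norm of each column
    residual = y - X @ w               # kept in sync with w below
    for _ in range(n_iters):
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes j's own term.
            rho = X[:, j] @ residual + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w

# Hypothetical data (an assumption for illustration): 60 sampled design points,
# 6 normalized design parameters, and a metric that depends on parameters 0 and 3.
rng = np.random.default_rng(0)
n_samples, n_params = 60, 6
X = rng.uniform(-1.0, 1.0, size=(n_samples, n_params))
true_w = np.array([3.0, 0.0, 0.0, -2.0, 0.0, 0.0])
y = X @ true_w + 0.01 * rng.standard_normal(n_samples)

w = lasso_coordinate_descent(X, y, lam=1.0)
selected = np.flatnonzero(np.abs(w) > 1e-8)   # parameters the model keeps
print("fitted weights:", np.round(w, 3))
print("parameters kept by the model:", selected.tolist())
```

In this sketch the penalty also shrinks the surviving weights slightly toward zero, which is the usual price of L1 regularization; a common follow-up, again an assumption rather than something stated in the article, is to refit ordinary least squares on the selected parameters only.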

        Reviews

        Kyle Rupnow

Design space exploration is a critical development stage for hardware design. With the growing complexity of applications and the corresponding hardware we design to accelerate them, the task of exploring the wide variety of potential hardware architectures is challenging and time-consuming. In this domain, hardware designers must use heuristics to intelligently explore the design space. Even in small designs there may be hundreds or thousands of possible design points to evaluate, and it is rarely feasible to evaluate all of them. In this paper, the authors propose an extension to design space exploration that not only explores the design space, but also automatically generates the design models used to perform the exploration. By automating model creation through an L1-regularized least squares regression technique, designers gain the ability to add design parameters and automatically infer which parameters influence the quality of output designs. The authors both describe the model generation technique and present a case study using two computer vision algorithms. Although the computer vision algorithms are simple and well studied, and the parameters explored are also comparatively simple, the authors present a compelling vision of a system where both the generation of design quality models and design space exploration using those models are automated. The studied design parameters are intuitive and have a clearly expected influence on one or more measurement metrics. In this sense, the results of automated model generation are unsurprising, as the parameters are expected to have these effects. However, the authors demonstrate the excellent performance of selected designs, as well as a technique that opens the door to automatic model generation and design space exploration of much larger design spaces.

This initial paper has significant promise, as it could fill a critical need in model generation and support efficient design space exploration. Further study is needed to ensure that this technique remains feasible and scalable as we scale the size of designs, the number of explored design parameters, and the number of interacting design blocks under simultaneous exploration. Even so, this paper presents a vision of automated model generation and exploration that takes an important step toward producing high-quality hardware implementations without significant pre-characterization or designer guidance during the exploration process.

Online Computing Reviews Service
