Loop Unrolling for Energy Efficiency in Low-Cost Field-Programmable Gate Arrays

Published: 21 January 2019

Abstract

Field-programmable gate arrays (FPGAs) are used for a wide variety of computations in low-cost embedded systems. Although these systems often have modest performance constraints, their energy consumption must typically be limited. Many FPGA applications employ repetitive loops that cannot be straightforwardly split into parallel computations. Performing a loop sequentially generally requires high-speed clocks that consume considerable clock power and sometimes require clock generation using a phase-locked loop (PLL). Loop unrolling addresses the high-speed clock issue, but its use often leads to significant combinational glitch power.

In this work, a computer-aided design (CAD) approach that unrolls loops for designs targeted to low-cost FPGAs is described. Our approach considers latency constraints in an effort to minimize energy consumption for loop-based computation. To reduce glitch power, a glitch-filtering approach is introduced that provides a balance between glitch reduction and design performance. Glitch-filter enable signals are generated and routed to the filters using resources best suited to the target FPGA. Our approach automatically inserts glitch filters and associated control logic into a design prior to processing with FPGA synthesis, place, and route tools. Our energy-saving loop-unrolling approach has been evaluated using five benchmarks often used in low-cost FPGAs. The energy-saving capabilities of the approach have been evaluated for an Intel Cyclone IV and a Xilinx Artix-7 FPGA using board-level power measurement. The use of unrolling and glitch filtering is shown to reduce energy by at least 65% for an Artix-7 device and 50% for a Cyclone IV device while meeting design latency constraints.
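
The clock-rate argument in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' CAD flow; it only assumes that a loop of `iterations` steps must finish within a fixed latency budget and that unrolling by `unroll_factor` lets that many steps complete per clock cycle. The function names and the example numbers are hypothetical, and the model ignores glitch power and the longer combinational path that unrolling creates.

```python
import math


def cycles_needed(iterations: int, unroll_factor: int) -> int:
    """Clock cycles to finish the loop when `unroll_factor` iterations
    are evaluated combinationally in each cycle (assumed model)."""
    if iterations <= 0 or unroll_factor <= 0:
        raise ValueError("iterations and unroll_factor must be positive")
    return math.ceil(iterations / unroll_factor)


def min_clock_mhz(iterations: int, unroll_factor: int, latency_us: float) -> float:
    """Lowest clock frequency (MHz) that still meets the latency budget
    (microseconds): cycles per microsecond equals MHz."""
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    return cycles_needed(iterations, unroll_factor) / latency_us


if __name__ == "__main__":
    # Hypothetical 32-iteration loop with a 1 microsecond latency budget.
    for factor in (1, 4, 32):
        mhz = min_clock_mhz(32, factor, 1.0)
        print(f"unroll x{factor}: {cycles_needed(32, factor)} cycles, "
              f"clock >= {mhz} MHz")
```

Under these assumptions, a higher unroll factor lowers the required clock frequency proportionally. That is the benefit the abstract describes, and the cost it goes on to address is the added glitching in the deeper combinational logic.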

Published in

ACM Transactions on Reconfigurable Technology and Systems, Volume 11, Issue 4
December 2018
93 pages
ISSN: 1936-7406
EISSN: 1936-7414
DOI: 10.1145/3303942
Editor: Steve Wilton

Copyright © 2019 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 21 January 2019
Accepted: 1 October 2018
Revised: 1 September 2018
Received: 1 April 2018
