research-article

Exploiting data-level parallelism for energy-efficient implementation of LDPC decoders and DCT on an FPGA

Published: 28 December 2011

Abstract

We explore the use of Data-Level Parallelism (DLP) as a way of improving energy efficiency and reducing the power consumed by applications running on an FPGA. We show that static power is a significant fraction of the overall power consumption in an FPGA and that, because interconnect dominates in an FPGA, it does not change significantly even as the area required by an architecture increases. We show that the degree of DLP can be used in conjunction with frequency scaling to reduce the overall power consumption.
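The reasoning in the abstract can be illustrated with a toy power model. This is a minimal sketch under stated assumptions, not the model used in the article: static power is treated as constant regardless of the number of parallel lanes, dynamic power is taken to be linear in clock frequency, and it is split into a shared part (clocking and control) and a per-lane part. The function names and all numeric constants are hypothetical and chosen only for illustration.

```python
# Toy model (hypothetical constants): constant static power plus dynamic
# power that grows linearly with clock frequency and with the lane count.

def power_mw(lanes, freq_mhz, static_mw=1000.0,
             shared_mw_per_mhz=2.0, lane_mw_per_mhz=1.0):
    """Total power in mW for `lanes` parallel datapaths clocked at `freq_mhz`."""
    dynamic_mw = (shared_mw_per_mhz + lanes * lane_mw_per_mhz) * freq_mhz
    return static_mw + dynamic_mw


def energy_per_block_uj(lanes, freq_mhz, cycles_per_block=1000):
    """Energy in microjoules to process one data block.

    Each lane needs `cycles_per_block` cycles per block, and `lanes`
    blocks complete in parallel, so the per-block time shrinks with `lanes`.
    """
    time_us = cycles_per_block / freq_mhz               # cycles / MHz = microseconds
    energy_nj = power_mw(lanes, freq_mhz) * time_us / lanes  # mW * us = nJ
    return energy_nj / 1000.0


if __name__ == "__main__":
    base_freq_mhz = 200.0
    for lanes in (1, 2, 4, 8):
        # Scenario A: keep the clock fixed; static energy is amortized over more blocks.
        e_fixed = energy_per_block_uj(lanes, base_freq_mhz)
        # Scenario B: keep throughput fixed by dividing the clock by the lane count.
        p_scaled = power_mw(lanes, base_freq_mhz / lanes)
        print(f"lanes={lanes}: energy/block at fixed clock = {e_fixed:.2f} uJ, "
              f"power at fixed throughput = {p_scaled:.0f} mW")
```

In this sketch, raising the degree of DLP at a fixed clock lowers the energy per block because the constant static power is spread over more work, while trading lanes for clock frequency at constant throughput lowers total power because the shared dynamic component shrinks with the clock. Real devices will deviate from these linear assumptions.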


          • Published in

            ACM Transactions on Reconfigurable Technology and Systems  Volume 4, Issue 4
            December 2011
            179 pages
            ISSN:1936-7406
            EISSN:1936-7414
            DOI:10.1145/2068716
            Copyright © 2011 ACM

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 28 December 2011
            • Accepted: 1 January 2011
            • Revised: 1 November 2010
            • Received: 1 August 2010
            Qualifiers

            • research-article
            • Research
            • Refereed
