Abstract
We explore Data-Level Parallelism (DLP) as a way to improve energy efficiency and reduce power consumption when running applications on an FPGA. We show that static power is a significant fraction of the overall power consumption in an FPGA and that it does not change significantly even as the area required by an architecture increases, because interconnect dominates in an FPGA. We show that the degree of DLP can be used in conjunction with frequency scaling to reduce the overall power consumption.
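The trade-off described above can be illustrated with a toy power model. The Python sketch below is only an illustration under stated assumptions, not the model used in this work: static power is held constant as lanes are added, dynamic power has a hypothetical shared component (for example control logic) plus a per-lane component, dynamic power scales linearly with clock frequency, and run time scales inversely with both the DLP degree and the frequency. All function names, parameter names, and numbers are hypothetical.

```python
def estimate(p_static_w, p_shared_w, p_lane_w, base_time_s,
             dlp_degree, freq_scale=1.0):
    """Return (total_power_w, run_time_s, energy_j) under a toy model.

    Assumptions (illustrative only, not taken from the paper):
      - static power does not change with the number of lanes;
      - dynamic power = (shared part + per-lane part * lanes) * freq_scale;
      - run time = base time / (lanes * freq_scale).
    """
    if dlp_degree < 1 or freq_scale <= 0:
        raise ValueError("dlp_degree must be >= 1 and freq_scale > 0")
    dynamic_w = (p_shared_w + p_lane_w * dlp_degree) * freq_scale
    total_w = p_static_w + dynamic_w
    run_time_s = base_time_s / (dlp_degree * freq_scale)
    return total_w, run_time_s, total_w * run_time_s


# Hypothetical numbers: 1.0 W static, 0.4 W shared dynamic, 0.1 W per lane.
params = dict(p_static_w=1.0, p_shared_w=0.4, p_lane_w=0.1, base_time_s=2.0)

baseline = estimate(dlp_degree=1, **params)                    # one lane, full clock
wide_fast = estimate(dlp_degree=4, **params)                   # four lanes, full clock
wide_slow = estimate(dlp_degree=4, freq_scale=0.25, **params)  # four lanes, quarter clock

for name, (power, time, energy) in [("baseline", baseline),
                                    ("4 lanes", wide_fast),
                                    ("4 lanes, f/4", wide_slow)]:
    print(f"{name}: {power:.2f} W, {time:.2f} s, {energy:.2f} J")
```

In this sketch, widening the datapath at full clock finishes sooner and spreads the fixed static power over less time, so energy per task drops. Widening and then slowing the clock keeps the baseline run time while lowering total power. Real devices will deviate from these linear assumptions.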
Index Terms
Exploiting data-level parallelism for energy-efficient implementation of LDPC decoders and DCT on an FPGA