Abstract
Embedded image processing systems such as smartphones and digital cameras have tight limits on storage, computation power, network connectivity, and battery usage. These limits make efficient image coding important. In this article, we present a novel heap-based priority queue structure employed by an Adaptive Scanning of Wavelet Data (ASWD) scheme targeting an embedded platform. ASWD is a context modeling block, implemented via priority queues in a wavelet-based image coder, that reorganizes the wavelet coefficients into locally stationary sequences. The architecture we propose makes efficient, adaptive use of the FPGA’s on-chip dual-port memories. An index-aware system linked to each element in the queue makes the location of every queue element traceable in the heap, as the ASWD algorithm requires. Moreover, the use of 4-port memories together with intelligent data concatenation of queue elements yields cost-effective, enhanced memory access. The memory ports are adaptively assigned to different units during different processing phases, so that each phase gets the memory access it requires. The architectural innovations can also be exploited in other applications that require efficient hardware implementations of a generic priority queue, or in classical sorting applications that sort into the index. We designed and validated the hardware on an Altera Stratix IV FPGA as an IP accelerator in a Nios II processor-based System on Chip. We show that our architecture at 150 MHz can provide a 45X speedup compared to an embedded ARM Cortex-A9 processor at 666 MHz, targeting a throughput of 10 MB/s.
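To make the index-aware idea concrete, the following is a minimal software sketch, not the FPGA architecture described above. It assumes a binary max-heap and keeps a side table recording where each element currently sits, so an element's priority can be changed without searching the heap. The class name `IndexAwareHeap`, its methods, and the choice of a max-heap are hypothetical and chosen only for illustration.

```python
class IndexAwareHeap:
    """Binary max-heap that tracks the current heap index of every element.

    Illustrative assumption only: the names and the max-heap ordering are
    not taken from the paper.
    """

    def __init__(self):
        self.heap = []  # (priority, ident) pairs stored in heap order
        self.pos = {}   # ident -> current index of that element in self.heap

    def _swap(self, i, j):
        # Swap two slots and keep the position table in sync.
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[i][0] <= self.heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] > self.heap[largest][0]:
                    largest = child
            if largest == i:
                break
            self._swap(i, largest)
            i = largest

    def push(self, ident, priority):
        self.heap.append((priority, ident))
        self.pos[ident] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def update(self, ident, priority):
        # The position table gives the element's slot directly, with no search.
        i = self.pos[ident]
        old = self.heap[i][0]
        self.heap[i] = (priority, ident)
        if priority > old:
            self._sift_up(i)
        else:
            self._sift_down(i)

    def pop(self):
        # Remove and return the element with the highest priority.
        top_priority, top_ident = self.heap[0]
        last = self.heap.pop()
        del self.pos[top_ident]
        if self.heap:
            self.heap[0] = last
            self.pos[last[1]] = 0
            self._sift_down(0)
        return top_ident, top_priority
```

In this sketch `update` first looks up the element's slot in `pos` and then restores heap order by sifting up or down. That lookup is the software analogue of a queue element's location being traceable in the heap.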
Index Terms
ARC 2014: Towards a Fast FPGA Implementation of a Heap-Based Priority Queue for Image Coding Using a Parallel Index-Aware Tree