
Parallel Reconfigurable Computing-Based Mapping Algorithm for Motion Estimation in Advanced Video Coding

Published: 01 August 2012

Abstract

The computational load of motion estimation in the advanced video coding (AVC) standard is high, and it grows further for HDTV and super-resolution sequences. In this article, a video processing algorithm is dynamically mapped onto a new parallel reconfigurable computing (PRC) architecture that consists of multiple dynamic reconfigurable computing (DRC) units. First, we construct a directed acyclic graph (DAG) to represent video coding algorithms, with motion estimation as the focus. We then propose a novel parallel partitioning approach that maps the motion estimation DAG onto the multiple DRC units of a PRC system. For a given number of processing elements, the partitioning algorithm can optimize the design of parallel reconfigurable systems across different search ranges. This speeds up video processing with minimal sacrifice.
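To make the idea of splitting a motion estimation DAG across several units concrete, the following is a minimal sketch and not the partitioning algorithm of the article. It assumes full-search block matching with a sum-of-absolute-differences (SAD) cost, treats each candidate displacement as an independent DAG node, and uses a hypothetical round-robin policy to assign those nodes to `num_units` units. All function names and parameters here are illustrative assumptions.

```python
from itertools import product


def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of ref
    displaced by (dx, dy). Frames are lists of rows of integers."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def candidates(bx, by, n, search_range, width, height):
    """Displacements within +/- search_range that keep the block inside the frame."""
    r = range(-search_range, search_range + 1)
    return [
        (dx, dy)
        for dy, dx in product(r, r)
        if 0 <= bx + dx and bx + dx + n <= width
        and 0 <= by + dy and by + dy + n <= height
    ]


def partition(nodes, num_units):
    """Hypothetical policy: round-robin split of independent SAD nodes."""
    return [nodes[i::num_units] for i in range(num_units)]


def motion_vector(cur, ref, bx, by, n, search_range, num_units):
    """Return ((dx, dy), cost) for the best match of one block."""
    height, width = len(ref), len(ref[0])
    parts = partition(candidates(bx, by, n, search_range, width, height), num_units)
    # Each unit reduces its own SAD nodes to a local best (cost, dx, dy).
    local_best = [
        min((sad(cur, ref, bx, by, dx, dy, n), dx, dy) for dx, dy in part)
        for part in parts
        if part
    ]
    # Final DAG node: compare the per-unit winners.
    cost, dx, dy = min(local_best)
    return (dx, dy), cost
```

Because the SAD nodes do not depend on one another, the result is the same for any value of `num_units`; only the amount of work per unit changes. Widening `search_range` increases the number of candidate nodes roughly quadratically, which is why the search range matters when choosing how many units to use.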

