research-article

Accelerating Transform Algorithm Implementation for Efficient Intra Coding of 8K UHD Videos

Published: 04 March 2022

Abstract

Real-time ultra-high-definition (UHD) video applications have attracted much attention, and their encoders urgently need high-throughput hardware implementations of the two-dimensional (2D) transform for the latest video coding standards. This article proposes an effective method for accelerating the transform algorithm in UHD intra coding, based on the third generation of the Audio Video Coding Standard (AVS3). First, through detailed statistical analysis, we devise an efficient, hardware-friendly transform algorithm that markedly reduces running cycles and resource consumption. Second, to enable multiplierless computation that saves resources and power, we investigate a series of shift-and-add unit (SAU) hardware designs that use far fewer shifters and adders than existing methods. Third, we design several types of hardware acceleration, including calculation pipelining, logical-loop unrolling, and module-level parallelism, to efficiently support data-intensive, high-frame-rate 8K UHD video coding. Finally, because 8K video sources are scarce, we also provide a new dataset for performance verification. Experimental results demonstrate that the proposed method achieves real-time 8K intra encoding at more than 60 fps with negligible loss in rate-distortion (R-D) performance: 0.98% Bjontegaard-Delta Bit-Rate (BD-BR) on average.
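The multiplierless idea mentioned in the abstract can be illustrated with a minimal software sketch. It is not the SAU design of the article, whose details are not given here. It only shows the general principle that multiplying by a fixed integer constant can be replaced by shifts and additions, one shifted term per set bit of the constant. The function names and the constants 83 and 36 are assumptions chosen for illustration only.

```python
def shift_add_terms(constant):
    """Return the bit positions that are set in a non-negative integer constant.

    Each position corresponds to one shifted copy of the input in the
    shift-and-add form of the multiplication.
    """
    if constant < 0:
        raise ValueError("this sketch only handles non-negative constants")
    return [bit for bit in range(constant.bit_length()) if (constant >> bit) & 1]


def multiply_shift_add(x, constant):
    """Multiply x by a non-negative constant using only shifts and additions."""
    result = 0
    for bit in shift_add_terms(constant):
        # x << bit equals x * 2**bit, so the shifted terms sum to x * constant.
        result += x << bit
    return result


if __name__ == "__main__":
    # Illustrative constants only; they are not taken from the article.
    for coeff in (83, 36):
        terms = shift_add_terms(coeff)
        print(coeff, "uses shifts", terms, "and", len(terms) - 1, "additions")
        assert multiply_shift_add(7, coeff) == 7 * coeff
```

In this plain binary decomposition the number of adders equals the number of set bits minus one, which is why hardware designs of this kind typically look for representations of the coefficients that need fewer nonzero terms or that share intermediate sums.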


    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 4
      November 2022
      497 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3514185
      Editor: Abdulmotaleb El Saddik


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 March 2022
      • Revised: 1 December 2021
      • Accepted: 1 December 2021
      • Received: 1 July 2021
      Qualifiers

      • research-article
      • Refereed
