
Residual-guided In-loop Filter Using Convolution Neural Network

Published: 12 November 2021

Abstract

The block-based coding structure in the hybrid video coding framework inevitably introduces compression artifacts such as blocking and ringing. To compensate for these artifacts, a wide range of in-loop filtering techniques have been proposed for video codecs, improving both the subjective and objective quality of reconstructed videos. More recently, neural network-based filters have been introduced, drawing on the power of deep learning from large amounts of data. Although these filters improve coding efficiency over traditional methods in High-Efficiency Video Coding (HEVC), the rich features and information generated by the compression pipeline have not been fully utilized in the design of the neural networks. In this article, we therefore propose the Residual-Reconstruction-based Convolutional Neural Network (RRNet) to further improve coding efficiency, where the compression features derived from the bitstream, in the form of the prediction residual, are fed into the network as an additional input alongside the reconstructed frame. In essence, the residual signal provides valuable information about block partitions and can aid the reconstruction of edge and texture regions in a picture. As a result, more adaptive parameters can be trained to handle different texture characteristics. Experimental results show that the proposed RRNet achieves significant BD-rate savings compared to HEVC and state-of-the-art CNN-based schemes, indicating that the residual signal plays a significant role in enhancing video frame reconstruction.
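As a rough illustration of what "an additional input" can mean in practice, the sketch below treats the reconstructed frame and the prediction residual as two input channels of a single 3x3 convolution and adds the result back to the reconstruction as a correction. This is only a minimal sketch under stated assumptions: the abstract does not describe the RRNet layers, so the channel stacking, the global skip connection, the kernel values, and the names `conv3x3` and `residual_guided_filter` are all hypothetical and are not the authors' architecture.

```python
# Illustrative sketch only: the layer layout, weights, and function names
# below are assumptions, not the RRNet architecture from the paper.

def conv3x3(channels, kernels, bias=0.0):
    """Apply one 3x3 convolution (zero padding) over a list of 2-D channels.

    channels: list of H x W nested lists; kernels: one 3x3 kernel per channel.
    Returns a single H x W output map (sum over all input channels).
    """
    h, w = len(channels[0]), len(channels[0][0])
    out = [[bias] * w for _ in range(h)]
    for ch, k in zip(channels, kernels):
        for y in range(h):
            for x in range(w):
                acc = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            acc += k[dy + 1][dx + 1] * ch[yy][xx]
                out[y][x] += acc
    return out


def residual_guided_filter(recon, residual, k_recon, k_resid):
    """Filter `recon` using `residual` as a second input channel.

    The convolution output is treated as a correction that is added back
    to the reconstructed frame (a global skip connection, assumed here).
    """
    correction = conv3x3([recon, residual], [k_recon, k_resid])
    return [[r + c for r, c in zip(row_r, row_c)]
            for row_r, row_c in zip(recon, correction)]


# Toy 4x4 frame: prediction plus decoded residual gives the reconstruction.
prediction = [[10.0] * 4 for _ in range(4)]
residual = [[0.0, 0.0, 2.0, 2.0] for _ in range(4)]
recon = [[p + r for p, r in zip(rp, rr)]
         for rp, rr in zip(prediction, residual)]

# Hand-picked kernels (a trained network would learn these instead).
zero = [[0.0] * 3 for _ in range(3)]
center = [[0.0, 0.0, 0.0], [0.0, -0.5, 0.0], [0.0, 0.0, 0.0]]

unchanged = residual_guided_filter(recon, residual, zero, zero)
softened = residual_guided_filter(recon, residual, zero, center)
```

With all-zero kernels the filter leaves the reconstruction untouched. With the hand-picked `center` kernel on the residual channel, the correction follows the residual, so only the columns where the residual is non-zero are modified. That is the intuition behind letting the residual guide where the filter acts.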



    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 4
      November 2021
      529 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3492437

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 November 2021
      • Accepted: 1 April 2021
      • Revised: 1 February 2021
      • Received: 1 August 2020

      Qualifiers

      • research-article
      • Refereed
