research-article

NR-CNN: Nested-Residual Guided CNN In-loop Filtering for Video Coding

Published: 04 March 2022

Abstract

Recently, deep learning for video coding, such as deep predictive coding, deep transform coding, and deep in-loop filtering, has become an emerging research area. Data-driven models can substantially improve the coding gain of the hybrid coding framework. However, previous deep coding tools, especially deep in-loop filters, mainly target performance improvement while paying less attention to the reliability, usability, and adaptivity of the networks. In this article, a nested-residual guided convolutional neural network (NR-CNN) structure with a cascaded global shortcut and configurable residual blocks is proposed for in-loop filtering. By taking advantage of the correlation between different color components, we further extend the NR-CNN by using luminance as textural and structural guidance for chrominance filtering, which significantly improves the filtering performance. To integrate the proposed network into the codec effectively, we subsequently introduce an efficient and adaptive framework consisting of an adaptive granularity optimization and a parallel inference pipeline for deep learning based filtering. The former improves coding performance through adaptive decision-making based on rate-distortion analysis at various granularities. The latter reduces the running time of network inference. Extensive experimental results show the superiority of the proposed method, which achieves 8.2%, 14.9%, and 13.2% BD-rate savings on average under the random access (RA) configuration. The proposed method also obtains better subjective quality.
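
The adaptive granularity optimization mentioned in the abstract can be pictured as a rate-distortion comparison between signalling the filter once per frame and signalling it per block. The following is only a minimal sketch of that general idea, not the authors' implementation: the function names, the restriction to two granularities, the sum-of-squared-errors distortion, and the one-bit flag cost are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Block = Sequence[float]  # flattened samples of one block (hypothetical layout)


def sse(a: Block, b: Block) -> float:
    """Sum of squared errors between two blocks of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def choose_filter_flags(
    original: List[Block],
    unfiltered: List[Block],
    filtered: List[Block],
    lam: float,
    flag_bits: float = 1.0,  # assumed cost of one on/off flag, in bits
) -> Tuple[str, List[bool], float]:
    """Pick the signalling granularity with the lowest cost J = D + lam * R.

    Returns (mode, per-block on/off flags, RD cost).
    """
    d_off = [sse(o, u) for o, u in zip(original, unfiltered)]
    d_on = [sse(o, f) for o, f in zip(original, filtered)]
    n = len(original)

    # Frame-level decisions: a single flag covers every block.
    cost_frame_off = sum(d_off) + lam * flag_bits
    cost_frame_on = sum(d_on) + lam * flag_bits

    # Block-level decision: one frame flag plus one flag per block,
    # each block keeps whichever reconstruction is closer to the original.
    block_flags = [on < off for on, off in zip(d_on, d_off)]
    cost_block = sum(min(on, off) for on, off in zip(d_on, d_off))
    cost_block += lam * flag_bits * (1 + n)

    candidates = [
        (cost_frame_off, "frame_off", [False] * n),
        (cost_frame_on, "frame_on", [True] * n),
        (cost_block, "block", block_flags),
    ]
    cost, mode, flags = min(candidates, key=lambda c: c[0])
    return mode, flags, cost


# Toy data: the filter helps block 0 and hurts block 1.
original = [[10.0, 10.0], [20.0, 20.0]]
unfiltered = [[12.0, 12.0], [20.0, 20.0]]
filtered = [[10.0, 10.0], [23.0, 23.0]]

print(choose_filter_flags(original, unfiltered, filtered, lam=1.0))
print(choose_filter_flags(original, unfiltered, filtered, lam=10.0))
```

With a small Lagrange multiplier the extra flags are cheap, so the sketch selects per-block signalling; with a large multiplier the flag overhead outweighs the distortion gain and a single frame-level decision wins.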

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 4
      November 2022
      497 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3514185
      • Editor: Abdulmotaleb El Saddik

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 March 2022
      • Accepted: 1 November 2021
      • Revised: 1 October 2021
      • Received: 1 June 2021

      Qualifiers

      • research-article
      • Refereed