Abstract
With the rapid development of neural-network-based machine learning algorithms, deep learning methods are being tentatively applied far beyond well-known artificial intelligence applications such as face recognition and autonomous driving. Recently, deep learning models have been investigated intensively to improve the compression efficiency of video coding, especially at the in-loop filtering stage. Although prior deep learning-based in-loop filtering methods have already shown remarkable potential in video coding, the content propagation issue has not yet been well recognized or addressed. Content propagation refers to the fact that the contents of reference frames are propagated to the frames that refer to them, which typically leads to over-filtering. In this article, we develop an iteratively trained deep in-loop filter with adaptive model selection (iDAM) to address the content propagation issue. First, we propose an iterative training scheme that enables the network to gradually take the impact of content propagation into account. Second, we propose a filter selection mechanism that allows each block to select from a set of candidate filters with different filtering strengths. In addition, we propose a novel approach to designing a conditional in-loop filter that handles multiple quality levels with a single model and provides filter selection by modifying the input parameters. Extensive experiments on top of the latest video coding standard (Versatile Video Coding, VVC) have been conducted to evaluate the proposed techniques. Compared with VTM-11.0, our scheme achieves a new state of the art, with average BD-rate reductions of {7.91%, 20.25%, 20.44%}, {11.64%, 26.40%, 26.50%}, and {10.97%, 26.63%, 26.77%} for {Y, Cb, Cr} under the all-intra, random-access, and low-delay configurations, respectively. As far as we know, the proposed iDAM scheme provides the highest coding performance among all existing solutions.
In addition, the syntax elements of the proposed scheme were adopted at the 76th meeting of the Audio Video coding Standard (AVS) held this year.
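The block-level filter selection described in the abstract can be illustrated with a small sketch. This is not the iDAM implementation: the candidate filters below are hypothetical stand-ins (a simple blend toward the block mean at different strengths) rather than learned networks, and the encoder-side decision uses distortion only, whereas a real encoder would more likely weigh rate as well. All function names are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Block = List[List[float]]
Filter = Callable[[Block], Block]


def sse(a: Block, b: Block) -> float:
    """Sum of squared errors between two equally sized blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def make_blend_filter(strength: float) -> Filter:
    """Hypothetical stand-in for a learned filter.

    Pulls every sample toward the block mean by `strength` (0.0 = no filtering).
    """
    def apply(block: Block) -> Block:
        count = sum(len(row) for row in block)
        mean = sum(sum(row) for row in block) / count
        return [[x + strength * (mean - x) for x in row] for row in block]
    return apply


def select_filter(
    recon: Block, original: Block, candidates: Sequence[Filter]
) -> Tuple[int, Block]:
    """Encoder-side choice: index and output of the candidate closest to the original.

    Assumption: distortion-only decision; rate cost of the index is ignored.
    """
    best_idx, best_out, best_cost = -1, recon, float("inf")
    for idx, candidate in enumerate(candidates):
        out = candidate(recon)
        cost = sse(out, original)
        if cost < best_cost:
            best_idx, best_out, best_cost = idx, out, cost
    return best_idx, best_out


# Three candidates with increasing filtering strength.
candidates = [make_blend_filter(s) for s in (0.0, 0.5, 1.0)]

# A flat original block with noisy reconstruction: strong filtering wins.
original = [[10.0, 10.0], [10.0, 10.0]]
recon = [[8.0, 12.0], [9.0, 11.0]]
idx, out = select_filter(recon, original, candidates)
print(idx)  # 2
```

For a textured block such as `[[0.0, 20.0], [0.0, 20.0]]` reconstructed as `[[1.0, 19.0], [1.0, 19.0]]`, the same routine picks index 0 (no filtering), which is the kind of per-block adaptivity that helps avoid over-filtering. In a codec, the chosen index would presumably be signalled to the decoder, since the decoder has no access to the original block.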
Index Terms
iDAM: Iteratively Trained Deep In-loop Filter with Adaptive Model Selection