Abstract
To reduce the redundancy among color channels, e.g., Y, U, and V, previous methods usually adopt a linear model, which tends to be too simple for complex image content. We propose a neural-network-based method for cross-channel prediction in intra-frame coding. The proposed network uses two kinds of cues: the neighboring reconstructed samples of all channels, and the co-located reconstructed samples of a subset of the channels. Specifically, for YUV video coding, the neighboring Y, U, and V samples are processed by several fully connected layers, the co-located Y samples are processed by convolutional layers, and the network fuses the two cues. We observe that integrating both cues is crucial to the intra prediction performance for the chroma components. We have designed the network architecture to balance compression performance against computational efficiency. Moreover, we propose a transform domain loss for training the network. The transform domain loss helps obtain more compact representations of the residuals in the transform domain, leading to higher compression efficiency. The proposed method is integrated into the HEVC and VVC test models to evaluate its effectiveness. Experimental results show that our method provides more accurate cross-channel intra prediction than previous methods. On top of HEVC, our method achieves on average 1.3%, 5.4%, and 3.8% BD-rate reductions for Y, Cb, and Cr on common test sequences, and on average 3.8%, 11.3%, and 9.0% BD-rate reductions for Y, Cb, and Cr on ultra-high-definition test sequences. On top of VVC, our method achieves on average 0.5%, 1.7%, and 1.3% BD-rate reductions for Y, Cb, and Cr on common test sequences.
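The idea of a transform domain loss can be illustrated with a minimal sketch. The abstract does not state which transform or norm is used, so the code below assumes an orthonormal 2-D DCT-II and a mean absolute value over the coefficients; the function names `dct_matrix` and `transform_domain_loss` are hypothetical and are not taken from the paper or from any codec software.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis as an n x n matrix (one basis vector per row)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)   # scale the DC row
    basis[1:, :] *= np.sqrt(2.0 / n)  # scale the AC rows
    return basis


def transform_domain_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute value of the 2-D DCT coefficients of the prediction residual.

    Assumption: an L1 penalty on orthonormal DCT-II coefficients stands in for
    the paper's loss, whose exact form is not given in the abstract.
    """
    residual = target.astype(np.float64) - pred.astype(np.float64)
    d_rows = dct_matrix(residual.shape[0])
    d_cols = dct_matrix(residual.shape[1])
    coeffs = d_rows @ residual @ d_cols.T  # separable 2-D transform
    return float(np.mean(np.abs(coeffs)))


# Example: a 4x4 chroma block whose prediction is off by a constant offset.
target_block = np.full((4, 4), 2.0)
pred_block = np.zeros((4, 4))
loss = transform_domain_loss(pred_block, target_block)
```

In this example the residual is constant, so all of its energy falls into the single DC coefficient. A loss of this kind therefore penalizes it less than a residual of the same pixel-domain magnitude that spreads across many coefficients, which is one way to read the abstract's point about compact residual representations.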
Index Terms
Neural-Network-Based Cross-Channel Intra Prediction