research-article

Neural-Network-Based Cross-Channel Intra Prediction

Published: 22 July 2021

Abstract

To reduce the redundancy among color channels, e.g., YUV, previous methods usually adopt a linear model, which tends to be too simple for complex image content. We propose a neural-network-based method for cross-channel prediction in intra-frame coding. The proposed network uses twofold cues: the neighboring reconstructed samples of all channels, and the co-located reconstructed samples of a subset of the channels. Specifically, for YUV video coding, the neighboring Y, U, and V samples are processed by several fully connected layers, the co-located Y samples are processed by convolutional layers, and the network fuses the two cues. We observe that integrating the twofold information is crucial to the performance of intra prediction of the chroma components. We design the network architecture to balance compression performance against computational efficiency. Moreover, we propose a transform domain loss for training the network. The transform domain loss helps obtain more compact representations of the residues in the transform domain, leading to higher compression efficiency. The proposed method is integrated into the HEVC and VVC test models to evaluate its effectiveness. Experimental results show that our method provides more accurate cross-channel intra prediction than previous methods. On top of HEVC, our method achieves on average 1.3%, 5.4%, and 3.8% BD-rate reductions for Y, Cb, and Cr on common test sequences, and on average 3.8%, 11.3%, and 9.0% BD-rate reductions for Y, Cb, and Cr on ultra-high-definition test sequences. On top of VVC, our method achieves on average 0.5%, 1.7%, and 1.3% BD-rate reductions for Y, Cb, and Cr on common test sequences.
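
The abstract does not give the exact form of the transform domain loss, so the following is only an illustrative sketch of the general idea: the prediction residual is penalized after a block transform rather than in the sample domain. It assumes an orthonormal 2-D DCT-II and a mean absolute value (L1) penalty on the coefficients; both choices, and the function names `dct_matrix` and `transform_domain_l1`, are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)     # DC row needs a different scale
    return mat


def transform_domain_l1(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute DCT coefficient of the residual (assumed loss form).

    pred and target are 2-D blocks of the same shape, e.g. a predicted
    and an original chroma block.
    """
    residual = target.astype(np.float64) - pred.astype(np.float64)
    d_rows = dct_matrix(residual.shape[0])
    d_cols = dct_matrix(residual.shape[1])
    coeffs = d_rows @ residual @ d_cols.T  # separable 2-D DCT
    return float(np.mean(np.abs(coeffs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.uniform(0.0, 1.0, size=(8, 8))
    flat_error = target - 0.1                          # constant offset
    noisy_error = target - rng.normal(0.0, 0.1, (8, 8))  # spread-out error
    print("flat residual :", transform_domain_l1(flat_error, target))
    print("noisy residual:", transform_domain_l1(noisy_error, target))
```

Under these assumptions, a constant residual puts all of its energy into the single DC coefficient, so it is penalized less than a residual of the same energy spread over many frequencies. That is one way a loss of this kind could favor residues that are compact in the transform domain.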
