
Structure-aware Video Style Transfer with Map Art

Published: 24 February 2023

Abstract

Changing the style of an image or video while preserving its content is a crucial criterion for assessing a new neural style transfer algorithm. However, it is very challenging to transfer a map art style to a video whose content comprises a map background and animated objects. In this article, we present a novel, comprehensive system that solves the problems of transferring map art style to such videos. Our system takes as input an arbitrary video, a map image, and an off-the-shelf map art image. It then generates an artistic video without damaging the functionality of the map or the consistency of its details. To meet this challenge, we propose a novel network, the Map Art Video Network (MAViNet), tailored objective functions, and a training set with rich animation content and diverse map structures. We have evaluated our method on various challenging cases and compared it extensively with related work. Our method substantially outperforms state-of-the-art methods in terms of visual quality and meets the criteria mentioned above in this research domain.
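
The abstract fixes the system's interface: a video, a map image, and a map art reference go in, and a stylized video comes out. Below is a minimal, hedged sketch of that interface only. The class name MAViNet is taken from the abstract, but its body, the stylize_video helper, and all tensor shapes are hypothetical illustrations, not the authors' implementation.

```python
# A hypothetical sketch of the input/output contract described in the
# abstract. MAViNet's real architecture, losses, and training are defined
# in the paper; every name below is an illustrative assumption.
import torch
import torch.nn as nn

class MAViNet(nn.Module):
    """Stand-in for the Map Art Video Network (MAViNet)."""

    def __init__(self):
        super().__init__()
        # The actual network fuses the map structure with the style
        # reference; an identity module keeps this sketch runnable.
        self.body = nn.Identity()

    def forward(self, frame: torch.Tensor,
                map_image: torch.Tensor,
                style_image: torch.Tensor) -> torch.Tensor:
        # frame:       (B, 3, H, W) current video frame (map + animated objects)
        # map_image:   (B, 3, H, W) map background whose structure must survive
        # style_image: (B, 3, H, W) off-the-shelf map art reference
        return self.body(frame)  # stylized frame at the same resolution

def stylize_video(model: MAViNet, frames, map_image, style_image):
    """Frame-by-frame application; in the actual method, temporal and
    structural consistency come from the network and its objectives."""
    with torch.no_grad():
        return [model(f, map_image, style_image) for f in frames]
```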



• Published in

  ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 3s
  June 2023, 270 pages
  ISSN: 1551-6857
  EISSN: 1551-6865
  DOI: 10.1145/3582887
  Editor: Abdulmotaleb El Saddik

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 24 February 2023
      • Online AM: 23 November 2022
      • Accepted: 11 November 2022
      • Revised: 2 June 2022
      • Received: 21 October 2021
Published in TOMM Volume 19, Issue 3s


      Qualifiers

      • research-article
