research-article

Deep Iterative Frame Interpolation for Full-frame Video Stabilization

Published: 16 January 2020

Abstract

Video stabilization is a fundamental technique for producing higher-quality videos. Prior works have explored video stabilization extensively, but most of them crop the frame boundaries and introduce moderate levels of distortion. We present a novel deep approach to video stabilization that generates video frames without cropping and with low distortion. The proposed framework uses frame interpolation techniques to generate in-between frames, which reduces inter-frame jitter. When interpolation is applied iteratively, the stabilization effect becomes stronger. A major advantage is that our framework is end-to-end trainable in an unsupervised manner. In addition, our method runs in near real-time (15 fps). To the best of our knowledge, this is the first work to propose an unsupervised deep approach to full-frame video stabilization. We show the advantages of our method through quantitative and qualitative evaluations that compare it with state-of-the-art methods.
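The iterative idea in the abstract can be illustrated with a toy one-dimensional analogy. The sketch below is not the authors' network: it assumes each frame is reduced to a single camera-position number, and it stands in for learned frame interpolation with a plain midpoint of the two neighbouring positions. The names `interpolate_midpoints`, `jitter`, and `stabilize` are hypothetical and chosen only for this illustration.

```python
def interpolate_midpoints(path):
    """One pass: replace each interior sample with the midpoint of its
    two neighbours in the input path (the end points are left as-is)."""
    smoothed = list(path)
    for i in range(1, len(path) - 1):
        smoothed[i] = 0.5 * (path[i - 1] + path[i + 1])
    return smoothed


def jitter(path):
    """Mean absolute second difference, used here as a simple
    (assumed) measure of inter-frame jitter."""
    diffs = [
        abs(path[i - 1] - 2.0 * path[i] + path[i + 1])
        for i in range(1, len(path) - 1)
    ]
    return sum(diffs) / len(diffs)


def stabilize(path, iterations):
    """Apply the midpoint interpolation pass repeatedly."""
    for _ in range(iterations):
        path = interpolate_midpoints(path)
    return path


if __name__ == "__main__":
    # A shaky camera trajectory: steady drift plus alternating shake.
    shaky = [0.0, 2.0, 1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
    for n in (0, 1, 5, 50):
        print(n, jitter(stabilize(shaky, n)))
```

In this toy setting the jitter measure shrinks as the number of passes grows, which mirrors the abstract's claim that iterating the interpolation strengthens the stabilization effect. The real method operates on full image frames with a trained network, so this sketch conveys only the intuition.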

    • Published in

      ACM Transactions on Graphics, Volume 39, Issue 1
      February 2020
      112 pages
      ISSN: 0730-0301
      EISSN: 1557-7368
      DOI: 10.1145/3366374

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 16 January 2020
      • Accepted: 1 September 2019
      • Revised: 1 July 2019
      • Received: 1 April 2019

      Qualifiers

      • research-article
      • Research
      • Refereed
