Abstract
Video stabilization is a fundamental technique for producing higher-quality videos. Prior works have explored video stabilization extensively, but most of them crop the frame boundaries and introduce moderate levels of distortion. We present a novel deep approach to video stabilization that generates video frames without cropping and with low distortion. The proposed framework uses frame interpolation techniques to generate in-between frames, which reduces inter-frame jitter. When the interpolation is applied iteratively, the stabilization effect becomes stronger. A major advantage is that our framework is end-to-end trainable in an unsupervised manner. In addition, our method runs in near real-time (15 fps). To the best of our knowledge, this is the first work to propose an unsupervised deep approach to full-frame video stabilization. We show the advantages of our method through quantitative and qualitative comparisons against state-of-the-art methods.
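The iterative idea can be illustrated with a toy sketch. It is not the proposed network: it assumes the camera motion is reduced to a one-dimensional position per frame, and it assumes that each interior frame is replaced by an interpolation of its two neighbors, with a plain average standing in for the learned frame interpolator. The names `interpolate_midpoint`, `stabilize`, and `jitter` are hypothetical and chosen only for this example.

```python
import math


def interpolate_midpoint(prev_pos, next_pos):
    # Assumption: a plain average stands in for a learned frame interpolator.
    return 0.5 * (prev_pos + next_pos)


def stabilize(path, iterations=1):
    # Replace every interior sample with the interpolation of its two
    # neighbors, and repeat the pass `iterations` times. The first and
    # last samples are left unchanged.
    path = list(path)
    for _ in range(iterations):
        smoothed = list(path)
        for i in range(1, len(path) - 1):
            smoothed[i] = interpolate_midpoint(path[i - 1], path[i + 1])
        path = smoothed
    return path


def jitter(path):
    # Mean absolute second difference: a simple proxy for inter-frame jitter.
    second_diffs = [
        path[i - 1] - 2.0 * path[i] + path[i + 1]
        for i in range(1, len(path) - 1)
    ]
    return sum(abs(d) for d in second_diffs) / len(second_diffs)


# Synthetic camera path: a steady pan plus an oscillating shake.
shaky = [0.2 * i + 0.5 * math.sin(2.0 * i) for i in range(50)]
once = stabilize(shaky, iterations=1)
many = stabilize(shaky, iterations=5)

print("jitter, original:    ", jitter(shaky))
print("jitter, 1 iteration: ", jitter(once))
print("jitter, 5 iterations:", jitter(many))
```

In this toy setting the steady pan is preserved while the shake shrinks with each pass, which mirrors the claim that repeated interpolation strengthens the stabilization effect. Because every output sample is synthesized from existing neighbors rather than warped, the sequence keeps its full length, loosely analogous to avoiding cropping.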
Index Terms
Deep Iterative Frame Interpolation for Full-frame Video Stabilization