Abstract
Three-hundred-sixty-degree (360°) video provides an immersive experience, allowing viewers to freely explore the world by turning their heads. However, creating high-quality 360° content is challenging: viewers may miss important events by looking in the wrong direction, or they may see things that break the immersion, such as stitching artifacts and the film crew. We take advantage of the fact that not all directions are equally likely to be observed; due to ergonomic constraints, most viewers are more likely to see content located at “true north,” i.e., directly in front of them. We therefore propose 360° video direction, in which the video is jointly optimized to orient important events toward the front of the viewer and visual clutter behind them, while producing smooth camera motion. Unlike traditional video, viewers can still explore the space as desired, but with the knowledge that the most important content is likely to be in front of them. Constraints can be user guided, either added directly on the equirectangular projection or recorded as “guidance” viewing directions while watching the video in a VR headset, or computed automatically, for example from visual saliency or the direction of forward motion. To accomplish this, we propose a new motion estimation technique designed specifically for 360° video that outperforms the commonly used five-point algorithm on wide-angle footage. We additionally formulate the direction problem as an optimization in which a novel parametrization of spherical warping allows us to correct for some degree of parallax. We compare our approach to recent methods that address stabilization only and that convert 360° video to narrow field-of-view video. Our pipeline can also present wide-angle non-360° footage in a spherical 360° space, giving an immersive “virtual cinema” experience for a wide range of existing content filmed with first-person cameras.
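To give a flavor of the geometry involved, the sketch below shows two building blocks common to 360° motion estimation: mapping equirectangular pixel coordinates to unit bearing vectors on the sphere, and recovering a frame-to-frame rotation from matched bearings via orthogonal Procrustes (the Kabsch algorithm). This is an illustrative baseline under our own assumptions, not the paper's actual estimator; the function names are hypothetical.

```python
import numpy as np

def equirect_to_sphere(u, v, w, h):
    """Map equirectangular pixel coords (u, v) in a w x h frame
    to unit vectors on the sphere (x right, y up, z forward)."""
    lon = (np.asarray(u, float) / w) * 2.0 * np.pi - np.pi   # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (np.asarray(v, float) / h) * np.pi   # latitude in [-pi/2, pi/2]
    return np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)

def rotation_from_bearings(p, q):
    """Estimate the rotation R minimizing sum ||q_i - R p_i||^2 over
    matched unit bearing vectors p, q (each of shape (N, 3)),
    via SVD of the cross-covariance (Kabsch / orthogonal Procrustes)."""
    H = p.T @ q                                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
```

A rotation-only model like this ignores translation and parallax entirely, which is exactly the gap a parallax-aware spherical warp is meant to address; the sketch is only the rigid first step.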
Joint Stabilization and Direction of 360° Videos