Abstract
A common way to view a 360° video on a 2D display is to crop and render part of the video as a normal field-of-view (NFoV) video. While this approach yields natural-looking NFoV videos, users must constantly adjust the viewing direction by hand so as not to miss interesting events in the video. In this paper, we propose an interactive and automatic navigation system for comfortable 360° video playback. Our system finds a virtual camera path that shows the most salient areas throughout the video, generates an NFoV video based on the path, and plays it in an online manner. A user can interactively change the viewing direction while watching a video, and the system instantly updates the path to reflect the user's intention. To enable online processing, we design our system as two steps: an offline pre-processing step and an online 360° video navigation step. The pre-processing step computes optical flow and saliency scores for an input video. Based on these, the online video navigation step computes an optimal camera path that reflects user interaction and plays an NFoV video in an online manner. For an improved user experience, we also introduce optical flow-based camera path planning, saliency-aware path update, and adaptive control of the temporal window size. Our experimental results, including user studies, show that our system provides a more pleasant experience of watching 360° videos than existing approaches.
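To make the idea of a smooth, saliency-following camera path concrete, here is a minimal sketch. It is not the formulation used in the paper: the function name `smooth_camera_path`, the quadratic energy, the per-frame `weights`, and the smoothness parameter `lam` are all assumptions made for illustration. The sketch handles a single unwrapped yaw angle per frame and ignores pitch, angular wraparound, optical flow, and the temporal window.

```python
def smooth_camera_path(targets, weights, lam=10.0):
    """Illustrative only (not the paper's method).

    Minimises  sum_t w_t * (p_t - s_t)^2 + lam * sum_t (p_{t+1} - p_t)^2
    where s_t is a hypothetical per-frame target yaw (e.g. a saliency peak)
    and w_t > 0 is how strongly frame t pulls the path toward its target.
    """
    n = len(targets)
    if n == 0:
        return []
    if n == 1:
        return [float(targets[0])]

    # Normal equations form a tridiagonal system (W + lam * L) p = W s,
    # with L the Laplacian of a chain of n frames.
    diag = [weights[t] + lam * (1 if t in (0, n - 1) else 2) for t in range(n)]
    rhs = [weights[t] * targets[t] for t in range(n)]
    off = -lam  # constant sub- and super-diagonal entry

    # Thomas algorithm: forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for t in range(1, n):
        denom = diag[t] - off * c_prime[t - 1]
        c_prime[t] = off / denom
        d_prime[t] = (rhs[t] - off * d_prime[t - 1]) / denom

    # Back substitution.
    path = [0.0] * n
    path[-1] = d_prime[-1]
    for t in range(n - 2, -1, -1):
        path[t] = d_prime[t] - c_prime[t] * path[t + 1]
    return path


if __name__ == "__main__":
    # Hypothetical saliency peaks (degrees) with one outlier frame.
    yaw_targets = [0.0, 0.0, 0.0, 90.0, 0.0, 0.0, 0.0]
    print(smooth_camera_path(yaw_targets, [1.0] * 7, lam=10.0))
```

Under the same assumptions, a user's manual change of direction could be modelled by setting that frame's target to the chosen yaw and giving it a very large weight, so the re-solved path passes almost exactly through the requested direction while staying smooth around it.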
Index Terms
Interactive and automatic navigation for 360° video playback