research-article

Machine Learning Based Content-Agnostic Viewport Prediction for 360-Degree Video

Published: 16 February 2022

Abstract

Accurate and fast estimation or prediction of the (near) future location of users of head-mounted devices within a virtual omnidirectional environment opens a plethora of opportunities in application domains such as interactive immersive gaming and tele-surgery. The past years have therefore seen growing attention to models for viewport prediction in 360° environments. Among these approaches, content-agnostic, trajectory-based methods have the potential to provide very fast solutions, as they do not require complex analysis of the videos to produce a prediction. However, accurate trajectory-based viewport prediction is difficult due to the intrinsic variability in user behaviour. Furthermore, even when making use of machine learning, current approaches tend to be brute-force and heavily tailored to specific datasets, with little comparison to existing benchmarks or publicly available studies. This article presents a generic, content-agnostic viewport prediction method consisting of a window-based approach combined with a preprocessing system that classifies behavioural patterns in terms of user clustering and trajectory correlation. Moreover, because the state of the art lacks a comparative analysis of different approaches, this work provides one. Based on the obtained results, a combined prediction model is proposed and evaluated. Our method shows a 36.8% to 53.9% improvement over the static prediction baseline for a prediction horizon of 8 seconds. In addition, an 11.5% to 24.0% improvement over a brute-force machine learning prediction approach is obtained. As such, this work contributes towards the creation of more generic and structured solutions for content-agnostic viewport prediction in terms of data representation, preprocessing and modelling.
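
To make the terms "window-based approach", "static prediction baseline" and "improvement" more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a trajectory is a list of (yaw, pitch) samples in degrees, that the static baseline predicts the last observed viewport centre for the whole horizon, and that error is measured as great-circle distance; the abstract specifies none of these details. All function names are hypothetical.

```python
import math


def great_circle_deg(a, b):
    """Angular distance in degrees between two (yaw, pitch) points in degrees."""
    yaw1, pitch1 = map(math.radians, a)
    yaw2, pitch2 = map(math.radians, b)
    cos_d = (math.sin(pitch1) * math.sin(pitch2)
             + math.cos(pitch1) * math.cos(pitch2) * math.cos(yaw1 - yaw2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))


def make_windows(trajectory, history, horizon):
    """Split a trajectory into (past window, future target) pairs.

    `history` and `horizon` are counted in samples (assumption): the target
    lies `horizon` samples after the last sample of the past window.
    """
    pairs = []
    for t in range(history, len(trajectory) - horizon + 1):
        past = trajectory[t - history:t]
        target = trajectory[t + horizon - 1]
        pairs.append((past, target))
    return pairs


def static_baseline(past):
    """Assumed static baseline: the viewport stays where it was last seen."""
    return past[-1]


def mean_error(trajectory, history, horizon, predictor=static_baseline):
    """Mean great-circle error of `predictor` over all windows."""
    pairs = make_windows(trajectory, history, horizon)
    if not pairs:
        raise ValueError("trajectory too short for this history and horizon")
    errors = [great_circle_deg(predictor(past), target) for past, target in pairs]
    return sum(errors) / len(errors)


def relative_improvement(baseline_error, model_error):
    """Percentage by which `model_error` improves on `baseline_error`."""
    return 100.0 * (baseline_error - model_error) / baseline_error
```

Under these assumptions, an 8-second horizon on a trace sampled at a hypothetical 10 Hz would correspond to `horizon=80`. Any learned predictor that maps a past window to a (yaw, pitch) pair can be passed as `predictor` and compared with the baseline through `relative_improvement`.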

          • Published in

            ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 2
            May 2022
            494 pages
            ISSN: 1551-6857
            EISSN: 1551-6865
            DOI: 10.1145/3505207

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 16 February 2022
            • Accepted: 1 July 2021
            • Revised: 1 May 2021
            • Received: 1 February 2021

            Qualifiers

            • research-article
            • Refereed