research-article
Open Access

Xatu: Richer Neural Network Based Prediction for Video Streaming

Published: 15 December 2021

Abstract

The performance of Adaptive Bitrate (ABR) algorithms for video streaming depends on accurately predicting the download time of video chunks. Existing prediction approaches (i) assume that chunk download times are dominated by network throughput; and (ii) cluster sessions a priori (e.g., based on ISP and CDN) and learn only from sessions in the same cluster. We make three contributions. First, through analysis of data from real-world video streaming sessions, we show that (i) a priori clustering prevents learning from related clusters; and (ii) factors such as the Time to First Byte (TTFB) are key components of chunk download times but are not easily incorporated into existing prediction approaches. Second, we propose Xatu, a new prediction approach that jointly learns a neural network sequence model with an interpretable automatic session clustering method. Xatu learns clustering rules across all sessions it deems relevant, and models sequences with multiple chunk-dependent features (e.g., TTFB) rather than just throughput. Third, evaluations using the above datasets and emulation experiments show that Xatu improves prediction accuracy by 23.8% relative to CS2P (a state-of-the-art predictor). We show that Xatu provides substantial performance benefits when integrated with multiple ABR algorithms, including MPC (a well-studied ABR algorithm) and FuguABR (a recent algorithm using stochastic control), relative to their default predictors (CS2P and a fully connected neural network, respectively). Further, Xatu combined with MPC outperforms Pensieve, an ABR algorithm based on deep reinforcement learning.
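To make the role of TTFB concrete, the following minimal sketch contrasts a throughput-only download-time estimate with one that also accounts for TTFB. It is an illustration only, not Xatu's model: the additive decomposition (TTFB plus transfer time) and the function names `throughput_only_estimate` and `ttfb_aware_estimate` are assumptions introduced here.

```python
def throughput_only_estimate(size_bytes: int, throughput_bps: float) -> float:
    """Estimate chunk download time (seconds) from throughput alone."""
    return size_bytes * 8 / throughput_bps


def ttfb_aware_estimate(size_bytes: int, ttfb_s: float, throughput_bps: float) -> float:
    """Estimate chunk download time (seconds) as TTFB plus transfer time.

    Assumption for illustration: the two components simply add up.
    """
    return ttfb_s + size_bytes * 8 / throughput_bps


if __name__ == "__main__":
    # Hypothetical chunk: 1 MB over an 8 Mbit/s link with a 250 ms TTFB.
    size, ttfb, tput = 1_000_000, 0.25, 8_000_000
    print(throughput_only_estimate(size, tput))
    print(ttfb_aware_estimate(size, ttfb, tput))
```

Under these assumed numbers, ignoring TTFB underestimates the download time by the full TTFB, and the relative error grows as chunks get smaller or links get faster.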
