Machine Learning--Based Parametric Audiovisual Quality Prediction Models for Real-Time Communications

Published: 15 March 2017

Abstract

To predict audiovisual quality automatically in interactive multimedia services, we have developed machine learning--based no-reference parametric models. We have compared Decision Trees--based ensemble methods, Genetic Programming, and Deep Learning models with one or more hidden layers. We have used the Institut national de la recherche scientifique (INRS) audiovisual quality dataset, which was specifically designed to include the ranges of parameters and degradations typically seen in real-time communications. Decision Trees--based ensemble methods have outperformed both Deep Learning-- and Genetic Programming--based models in terms of Root-Mean-Square Error (RMSE) and Pearson correlation values. We have also developed and trained models on various publicly available datasets and have compared our results with those of the original models for those datasets. Our studies show that Random Forests--based prediction models achieve high accuracy for both the INRS audiovisual quality dataset and other publicly available comparable datasets.
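
The two evaluation metrics named in the abstract, RMSE and Pearson correlation, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the sample scores below are hypothetical, and the 1-to-5 opinion-score scale is an assumption made only for illustration.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed scores."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson(predicted, observed):
    """Pearson linear correlation coefficient between two score lists."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)


# Hypothetical subjective scores and model predictions (assumed 1-5 scale).
observed_mos = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_mos = [1.5, 2.5, 3.5, 4.5, 5.5]

print("RMSE:", rmse(predicted_mos, observed_mos))
print("Pearson:", pearson(predicted_mos, observed_mos))
```

In this made-up example the predictions are offset from the observed scores by a constant 0.5, so the correlation is perfect while the RMSE is not zero. The two metrics capture different aspects of prediction accuracy, which is one reason to report both.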
