
Auditing YouTube’s Recommendation Algorithm for Misinformation Filter Bubbles

Published: 27 January 2023

Abstract

In this article, we present the results of an auditing study performed on YouTube, aimed at investigating how quickly a user can get into a misinformation filter bubble, but also what it takes to “burst the bubble,” i.e., to revert the bubble enclosure. We employ a sock puppet audit methodology, in which pre-programmed agents (acting as YouTube users) delve into misinformation filter bubbles by watching misinformation-promoting content. Then they try to burst the bubbles and reach more balanced recommendations by watching misinformation-debunking content. We record search results, home page results, and recommendations for the watched videos. Overall, we recorded 17,405 unique videos, out of which we manually annotated 2,914 for the presence of misinformation. The labeled data was used to train a machine learning model that classifies videos into three classes (promoting, debunking, neutral) with an accuracy of 0.82. We use the trained model to classify the remaining videos, which would not be feasible to annotate manually.
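
The sock puppet procedure lends itself to a compact illustration. The following Python sketch shows the shape of a single audit run; the seed lists and the watch_video, get_home_page, get_search_results, and get_recommendations helpers are hypothetical stand-ins for browser automation against a fresh YouTube profile, not the paper's actual implementation.

```python
# Minimal sketch of one sock puppet audit run (hypothetical helpers).
from dataclasses import dataclass, field

@dataclass
class Observation:
    phase: str        # "promoting" (bubble building) or "debunking" (bursting)
    watched: str      # ID of the video the agent just watched
    home_page: list = field(default_factory=list)
    search_results: list = field(default_factory=list)
    recommendations: list = field(default_factory=list)

# Hypothetical stand-ins for driving a fresh YouTube profile in a browser
# (e.g., via Selenium); each works with plain video IDs or query strings.
def watch_video(video_id: str) -> None: ...
def get_home_page() -> list: return []
def get_search_results(query: str) -> list: return []
def get_recommendations(video_id: str) -> list: return []

def run_sock_puppet(promoting: list, debunking: list, query: str) -> list:
    """First watch misinformation-promoting seed videos, then
    misinformation-debunking ones, recording search results, the home
    page, and recommendations after every watched video."""
    log = []
    for phase, seeds in (("promoting", promoting), ("debunking", debunking)):
        for video_id in seeds:
            watch_video(video_id)
            log.append(Observation(
                phase=phase,
                watched=video_id,
                home_page=get_home_page(),
                search_results=get_search_results(query),
                recommendations=get_recommendations(video_id),
            ))
    return log
```

A full audit would repeat such runs across the audited topics and several independent agent profiles.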
Using both the manually and automatically annotated data, we observe the misinformation bubble dynamics for a range of audited topics. Our key finding is that even though filter bubbles do not appear in some situations, when they do, it is possible to burst them by watching misinformation-debunking content (although this manifests differently from topic to topic). We also observe a sudden decrease of the misinformation filter bubble effect when misinformation-debunking videos are watched after misinformation-promoting ones, suggesting a strong contextuality of recommendations. Finally, when comparing our results with those of a previous similar study, we do not observe significant improvements in the overall quantity of recommended misinformation content.
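
For illustration, a three-class classifier of the kind mentioned above could be trained along these lines. This is a minimal sketch assuming TF-IDF features over video titles or transcripts and a linear model; the data, features, and model shown here are illustrative assumptions, not the study's actual setup.

```python
# Sketch of a "promoting" / "debunking" / "neutral" video classifier,
# assuming TF-IDF over video text; not the study's actual model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy placeholder data standing in for the manually annotated videos.
texts = [
    "the earth is flat and nasa hides it",
    "they do not want you to know this cure",
    "debunking flat earth claims step by step",
    "fact check: the miracle cure myth",
    "my favorite cat compilation",
    "guitar lesson for beginners",
]
labels = ["promoting", "promoting", "debunking",
          "debunking", "neutral", "neutral"]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=2, random_state=42)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))

# The trained model then labels the videos that were not annotated manually.
print(model.predict(["shocking truth about vaccines revealed"]))
```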

Published In

ACM Transactions on Recommender Systems, Volume 1, Issue 1
March 2023, 163 pages
EISSN: 2770-6699
DOI: 10.1145/3581755

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 January 2023
Online AM: 26 October 2022
Accepted: 05 October 2022
Revised: 22 August 2022
Received: 31 March 2022
Published in TORS Volume 1, Issue 1


Author Tags

  1. audit
  2. recommender systems
  3. filter bubble
  4. misinformation
  5. personalization
  6. automatic labeling
  7. ethics
  8. YouTube

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • Ministry of Education, Science, Research and Sport of the Slovak Republic
  • Central European Digital Media Observatory (CEDMO)
  • European Union
  • EU Horizon 2020
  • Horizon Europe, GA

Article Metrics

  • Downloads (last 12 months): 1,121
  • Downloads (last 6 weeks): 96
Reflects downloads up to 23 Sep 2024

Cited By

  • (2024) YouTube and Conspiracy Theories: A Longitudinal Audit of Information Panels. In Proceedings of the 35th ACM Conference on Hypertext and Social Media, 273–284. DOI: 10.1145/3648188.3675128. Online publication date: 10-Sep-2024.
  • (2024) Stemming the Tide of Problematic Information in Online Environments: Assessing Interventions and Identifying Opportunities for Interruption. In Companion Publication of the 16th ACM Web Science Conference, 37–41. DOI: 10.1145/3630744.3658615. Online publication date: 21-May-2024.
  • (2024) “I Searched for a Religious Song in Amharic and Got Sexual Content Instead”: Investigating Online Harm in Low-Resourced Languages on YouTube. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 141–160. DOI: 10.1145/3630106.3658546. Online publication date: 3-Jun-2024.
  • (2024) 8–10% of algorithmic recommendations are ‘bad’, but… an exploratory risk-utility meta-analysis and its regulatory implications. International Journal of Information Management 75, C. DOI: 10.1016/j.ijinfomgt.2023.102743. Online publication date: 1-Apr-2024.
  • (2023) Perception of Personalization Processes: Awareness, Data Sharing and Transparency. In Proceedings of the 2nd International Conference of the ACM Greek SIGCHI Chapter, 1–5. DOI: 10.1145/3609987.3610026. Online publication date: 27-Sep-2023.
  • (2023) Cyclops: Looking beyond the single perspective in information access systems. In Adjunct Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization, 34–37. DOI: 10.1145/3563359.3597398. Online publication date: 26-Jun-2023.
