Toward Mitigating Misinformation and Social Media Manipulation in the LLM Era

The pervasive abuse of misinformation to influence public opinion on social media has become increasingly evident across domains, from politics, as seen in presidential elections, to healthcare, most notably during the recent COVID-19 pandemic. This threat has grown in severity as the development of Large Language Models (LLMs) empowers manipulators to generate highly convincing deceptive content with greater efficiency. Furthermore, recent strides in LLM-powered chatbots, such as ChatGPT, have enabled the creation of human-like interactive social bots, posing a significant challenge to both human users and the social-bot-detection systems of social media platforms. These challenges motivate researchers to develop algorithms to mitigate misinformation and social media manipulation. This tutorial introduces advanced machine learning research relevant to this goal, including (1) detection of social manipulators, (2) learning causal models of misinformation and social manipulation, and (3) detection of LLM-generated misinformation. In addition, we present possible future directions.


TOPICS AND MOTIVATIONS
In recent years, the proliferation of misinformation on social media platforms, particularly during significant events like presidential elections and global pandemics, has become a growing concern. For example, during the COVID-19 pandemic, the widespread dissemination of misleading information, which downplayed the seriousness of the epidemic and exaggerated the side effects of COVID-19 vaccines, had detrimental effects on public health and eroded trust in credible sources [10,24]. Moreover, misinformation has frequently been harnessed to manipulate social outcomes and public opinion, severely undermining the credibility of content on social media platforms [12,34].
Researchers have been persistently working on combating misinformation and its manipulation of social media [2,13,18,32,33,38]. The proposed techniques have achieved success to a certain extent against traditional threats, e.g., social bots that post predefined content fabricated by human editors. However, the rapid advances in AI-generated content (AIGC) have opened Pandora's box. Equipped with large language models (LLMs) and in-context learning, the manipulators behind misinformation campaigns pose more serious threats:
• More interactive social bots. By crafting appropriate prompts, manipulators can make their social bots engage with other accounts autonomously. Such interactive social bots are considerably harder to detect than their traditional counterparts, which merely follow pre-defined scripts [3,11,40].

OUTLINES
This proposal includes the following sections:

Introduction: Rising New Challenges
This section will first briefly introduce the scientific definitions of misinformation and social media manipulation. It will present existing examples of manipulation discovered on social media. Furthermore, this section will discuss the following new challenges brought by the advances of LLMs:
• Difficulties of Detecting LLM-powered Social Bots
• High-quality Deceptive Misinformation Corpora
• Pervasion of AI-Generated Multi-Modal Media
From this part, the audience will establish a basic understanding of the motivation and concepts involved in this tutorial, such as misinformation campaigns.

Misinformation Detection in Large Model Era
The rise of Large Language Models is a double-edged sword for combating misinformation. On one side, LLMs have significantly eased the creation of highly deceptive misinformation. For instance, they can simulate the linguistic style of mainstream media to draft fake news and then generate falsified images or retrieve authentic but out-of-context content to support specific narratives. On the other side, LLMs' advanced capacity for handling text as well as inputs in other modalities (e.g., images, graphs, and tabular data [16,23,37]) benefits automatic fact verification. This section will introduce recent progress in:
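One family of methods covered here screens text by how likely it is under a language model: machine-generated text tends to score an unusually high average token likelihood (low perplexity). The sketch below is only a schematic, self-contained illustration of that idea; real detectors score texts under an actual LLM, whereas the smoothed unigram model here is a toy stand-in.

```python
import math
from collections import Counter


def unigram_model(corpus_tokens):
    """Build a unigram probability model with add-one smoothing.

    A toy stand-in for the language model a real detector would use.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen tokens
    def prob(tok):
        return (counts.get(tok, 0) + 1) / (total + vocab)
    return prob


def avg_log_likelihood(tokens, prob):
    """Average per-token log-likelihood of a text under the model.

    Unusually high values (relative to human-written references)
    can signal model-generated text.
    """
    return sum(math.log(prob(t)) for t in tokens) / len(tokens)
```

For example, scoring a phrase made of frequent corpus tokens yields a higher average log-likelihood than scoring out-of-vocabulary tokens, which is the contrast a likelihood-based detector thresholds on.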

Manipulator Detection on Social Media
Social-manipulation accounts, such as social bots, are increasingly abused by misinformation campaigns to spread specific narratives and manipulate public opinion [35]. Social bots equipped with advanced LLMs, such as ChatGPT, can easily exhibit human-like interactive behavior to avoid being detected by automatic algorithms. This section will introduce solutions to detect social manipulation, including detection based on individual behaviors (e.g., linguistic cues, IP address, following list) and coordinated activities (e.g., account groups with anomalously correlated activities). This section is further divided into the following parts:
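The coordinated-activity signal mentioned above can be made concrete with a minimal sketch: bucket each account's posting times into fixed-width bins and flag pairs of accounts whose activity histograms are suspiciously similar. The function names, the hour-long bins, and the 0.9 similarity threshold are illustrative assumptions, not a method from any specific system.

```python
import math
from itertools import combinations


def activity_histogram(timestamps, bin_size=3600):
    """Bucket an account's posting timestamps (seconds) into time bins."""
    hist = {}
    for t in timestamps:
        b = int(t // bin_size)
        hist[b] = hist.get(b, 0) + 1
    return hist


def cosine_similarity(h1, h2):
    """Cosine similarity between two sparse activity histograms."""
    dot = sum(v * h2.get(k, 0) for k, v in h1.items())
    n1 = math.sqrt(sum(v * v for v in h1.values()))
    n2 = math.sqrt(sum(v * v for v in h2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def flag_coordinated_pairs(accounts, threshold=0.9):
    """Flag account pairs whose posting-time patterns are near-identical.

    `accounts` maps an account name to a list of posting timestamps.
    """
    hists = {name: activity_histogram(ts) for name, ts in accounts.items()}
    flagged = []
    for a, b in combinations(sorted(accounts), 2):
        if cosine_similarity(hists[a], hists[b]) >= threshold:
            flagged.append((a, b))
    return flagged
```

Two bots posting within seconds of each other across the same hours would be flagged, while an unrelated account active at other times would not; production systems refine this pairwise idea with richer features and group-level (rather than pairwise) anomaly tests.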

Road to the Future
We will discuss potential directions for mitigating misinformation and social media manipulation, such as reinforcement-learning-based clarification recommendation and personalized clarification drafting.

RELEVANCE TO THE COMMUNITY
The rise of social manipulation and misinformation increasingly harms the credibility of online resources and information. To combat this threat more efficiently, algorithmic tools that incorporate advanced techniques in artificial intelligence, such as natural language processing, causal inference, and anomaly detection, are urgently needed. This tutorial aims to introduce these necessary tools to researchers in social media analysis. The first part will help new researchers establish a basic understanding of the targeted problem and some necessary concepts in related machine learning work. The remaining sections will introduce the advanced models in more detail.

FORMAT AND DETAILED SCHEDULE
This tutorial will be a lecture-style tutorial. The detailed schedule is as follows:
• Outline of the Tutorial
-Introduction (30 mins)
-Misinformation Detection in Large Model Era (55 mins)

SUPPORT MATERIALS
We will share the slides and tutorial recording after the conference. We will also provide a reference list of the papers involved in this tutorial. In addition, we have open-sourced on GitHub the code of our works covered in the tutorial, including our work on coordination detection and causal analysis of misinformation.

RELATED TUTORIALS
This tutorial does not have previous versions. It is the first time that we summarize the advanced machine learning works that will be helpful in addressing the threats of social media manipulation and misinformation in the LLM era. Besides our tutorial, other organizers hosted related tutorials before the rise of LLMs. Zafarani et al. hosted tutorials at WSDM 2019 and SIGKDD 2019 discussing advanced techniques for fake news detection [42,47]. Nakov et al. hosted tutorials at EMNLP 2020 and WSDM 2023 discussing techniques for fact-checking and stance detection [26,27]. Fung et al. hosted a tutorial at SIGKDD 2022 discussing natural language processing tools for combating misinformation [14]. Derczynski introduced the techniques across the whole pipeline of developing a fake news detector, including data collection [8]. Giachanou et al. hosted a tutorial at CIKM 2020 discussing the detection and mitigation of fake news and hate speech [15]. Lakshmanan et al. introduced misinformation detection and mitigation to the database community at VLDB 2019 [21]. The above tutorials mainly focus on addressing fake news content. Compared to our tutorial, they did not cover coordination detection or causal inference. Lee et al. hosted a tutorial at WWW 2014 (the conference is now named TheWebConference) about misinformation campaigns and malicious accounts [22]. However, as it was held in 2014, most of the advanced techniques introduced in our tutorial are not covered there.

OTHER DETAILS

Intended Audience
This tutorial targets researchers interested in social media analysis. It introduces advanced machine learning tools for combating misinformation and social media manipulation. These tools are helpful for analyzing data from social media. The introduction part of this tutorial is for newcomers to this area. The audience is expected to have introductory knowledge of fake news detection and machine learning. In the remaining three parts, we will provide the necessary explanations of the advanced concepts involved, and the audience is not expected to have experience with these topics.

Previous Offering
This tutorial does not have previous versions. It is the first time that we summarize the advanced machine learning works that will be helpful in addressing the threats of social media manipulation and misinformation in the LLM era.

ACKNOWLEDGEMENT
This work is partially supported by NSF Research Grant IIS-2226087. Views and conclusions are those of the authors and should not be interpreted as representing the official policies of the funding agency or the U.S. Government. Yizhou Zhang is also partly supported by the Annenberg Fellowship of the University of Southern California.
PRESENTERS

Yizhou Zhang is a Ph.D. candidate at the University of Southern California. His research interests mainly focus on machine learning and its application to social networks and social media. He has published papers in top conferences such as TheWebConf, NeurIPS, KDD, ICWSM, IJCAI, and ICDM.

Lun Du is a Researcher at Coupang, Inc. His research interests include Large Language Models, Graph Models, and their application in data mining. He has published 50+ papers in top conferences and journals, such as TheWebConf, NeurIPS, ICLR, KDD, ICSE, ACL, and WSDM, and his papers won the Best Paper Runner-up Award at CIKM'19 and the Best Short Paper Award at CIKM'21.

Karishma Sharma is an Applied Scientist at Amazon. She received her Ph.D. degree from the University of Southern California. Her research interests include machine learning and network analysis. She has published papers in top conferences and journals, such as TheWebConf, NeurIPS, KDD, and ICWSM.

Yan Liu is a Professor in the Computer Science Department and the Director of the Machine Learning Center at the University of Southern California. She received her Ph.D. degree from Carnegie Mellon University. Her research interest is machine learning and its applications to social network analysis, health care, and sustainability. She has received several awards, including the NSF CAREER Award, the Okawa Foundation Research Award, and New Voices of the Academies of Science, Engineering, and Medicine. She has served as program co-chair for KDD 2022, ICLR 2022, SDM 2020, and WSDM 2018, and associate program co-chair for AAAI 2021. She has given tutorials at top conferences in machine learning and applications, such as WSDM, ICML, AAAI, IJCAI, KDD, and CIKM, and delivered invited talks at various workshops at NeurIPS, ICML, KDD, IJCAI, and AAAI.