Attacking Fake News Detectors via Manipulating News Social Engagement

Social media is one of the main sources for news consumption, especially among the younger generation. With the increasing popularity of news consumption on various social media platforms, there has been a surge of misinformation, including false information and unfounded claims. As various text- and social-context-based fake news detectors are proposed to detect misinformation on social media, recent works have started to focus on the vulnerabilities of fake news detectors. In this paper, we present the first adversarial attack framework against Graph Neural Network (GNN)-based fake news detectors to probe their robustness. Specifically, we leverage a multi-agent reinforcement learning (MARL) framework to simulate the adversarial behavior of fraudsters on social media. Research has shown that in real-world settings, fraudsters coordinate with each other to share different news in order to evade the detection of fake news detectors. Therefore, we model our MARL framework as a Markov Game with bot, cyborg, and crowd worker agents, each of which has its own distinctive cost, budget, and influence. We then use deep Q-learning to search for the optimal policy that maximizes the rewards. Extensive experimental results on two real-world fake news propagation datasets demonstrate that our proposed framework can effectively degrade the performance of GNN-based fake news detectors. We hope this paper can provide insights for future research on fake news detection.


INTRODUCTION
With the burgeoning of social media, inaccurate or unfounded information (i.e., misinformation) is also circulating on social media, eroding people's belief in truth and science [6,40]. Unlike traditional news articles distributed by news outlets through their own media, social engagements like comments and sharing expedite the spread of misinformation and exaggerate its influence at scale. Recent research has pointed out that misinformation has been hindering the promotion of vaccines and threatening public health during the COVID-19 global pandemic [26].
To combat massive misinformation on social media, many machine learning based misinformation detectors have been proposed [48]. Besides the methods utilizing natural language processing techniques to check the news content and its writing style to verify its veracity [21,36,46], recent works have begun to leverage news social engagement using graph models for fact-checking [3,27,30,35]. Compared to the straightforward NLP-based methods, social-engagement-based methods regard engaged users as an integral part of news posts. Based on the theory and evidence that news consumers have preferences on news content (i.e., the echo chamber) [10,16,27,29,39], the engagement patterns of misinformation and fact are also different. Moreover, the prevalent bots and fraudsters engaged with fake news posts further differentiate their engagement patterns from regular ones [37].
Despite the rapid development of automatic fact-checking, most fake news detectors are static models vulnerable to adversarial attacks. Similar to many security problems, we must acknowledge that misinformation detection is an arms race between content moderators and malicious actors aiming at manipulating public opinion or gaining money through incited social engagement. Therefore, it is imperative to enhance the robustness of misinformation detectors. Though some recent works have investigated the robustness of NLP-based misinformation detectors [1,18,19,24,49], no work has probed the robustness of social-engagement-based misinformation detectors. [24] and [28] are the two closest works to ours; however, they either do not consider social-engagement-based detectors or do not model the diverse fraudster types in the misinformation campaign.
We use Figure 1 to demonstrate the vulnerability of social-engagement-based misinformation detectors. Many existing works [30,35] model news social engagement on social media as a heterogeneous graph where users and news posts are nodes, and an edge means a user has shared the post. Graph Neural Networks (GNNs) [15,22,45] have been widely leveraged to encode the above social engagement graph and predict the veracity of news posts. Many GNNs are designed to encode neighboring node information to enhance the prediction performance of the target node. To exploit this property, as shown in Figure 1, a fraudster who has shared many real news posts can flip the GNN-based misinformation detector's prediction on a target fake news post by sharing it, because the newly added real-news neighbors alleviate the suspiciousness of the target fake news.
To analyze the robustness of social-engagement-based misinformation detectors, inspired by GNN robustness research [42], we propose to attack GNN-based misinformation detectors by simulating the adversarial behaviors of fraudsters. However, the real-world misinformation campaign poses three non-trivial challenges for attack simulation: (1) To evade detection while promoting fake news on social media, malicious actors can only manipulate the controlled user accounts to share different social posts. However, most previous GNN adversarial attack works assume all nodes and edges can be perturbed, which is impractical. (2) Many deployed GNN-based fake news detectors are grey-box models with various model architectures tailored to the heterogeneous user-post graph. Thus, the gradient-based optimization methods used by previous works [50] cannot be utilized to devise an attack. (3) Real-world evidence [31,44] shows that various coordinated malicious actors engage in misinformation campaigns. Different types of malicious actors have different capabilities, budgets, and risk appetites. For instance, key opinion leaders have stronger influence than social bots but cost more to cultivate.
To overcome the above challenges, we devise a dedicated multi-agent reinforcement learning (MARL) framework, which none of the previous GNN robustness works has used. Specifically, to simulate the real-world behavior of fraudsters who share different posts, we harness a deep reinforcement learning framework to flip the classification result of a target news node by modifying the connections of users who shared the post. We model the MARL framework as a Markov Game where the agents work coordinately to flip the classification result. Overall, our contributions are:

• To the best of our knowledge, ours is the first work to probe the robustness of GNN-based fake news detectors from a social engagement perspective. Although there have been previous works on attacking fake news detectors using NLP methods, attacking fake news detectors by manipulating the social engagement of news targets has not been studied.

• We leverage a MARL framework to perform targeted attacks on GNN-based fake news detectors, simulating real-world misinformation campaigns. Specifically, we model fraudsters as agents with different costs, budgets, and influences in our framework.

• Our experimental results show that the proposed MARL framework can effectively flip the GNN prediction results. We discuss the vulnerabilities of GNN-based fake news detectors and provide insights on attack strategies and countermeasures.
The rest of the paper is organized as follows. In Section 2, we introduce related work. In Sections 3 and 4, we introduce the problem definition and the proposed framework. In Section 5, we report our experiment results and analysis. Finally, we discuss the limitations and future work of this paper in Section 6.

RELATED WORK
In this section, we review the related work on (1) graph neural network-based fake news detection; (2) adversarial attack on graph neural networks; and (3) adversarial attack on fake news detection.

GNN-based Fake News Detection
We can categorize the existing GNN-based misinformation detection works into two major categories according to their graph prototypes. 1) Propagation-based works [17,27,29,41]: these works model the sharing sequence of a news post as a tree-structured propagation graph with the news post as the root node and edges representing sharing relations between users. Detection can be formulated as either a propagation graph classification or a root node classification task. The propagation graph is infeasible for adversarial attacks because the attacker needs to employ many users to share the target post to flip its classification result; at the same time, such operations are naive from an optimization perspective and easily captured by simple outlier measurements. 2) Social-context-based works [5,30,35,47]: all users and their shared news posts (e.g., tweets) form a bipartite graph (as shown in Figure 1) where an edge means a user shared the post, and the objective is to train a GNN to classify the news post nodes. Note that previous works usually add the publisher as a third type of node connecting to social posts. In this paper, we only consider the commonly used graph prototype (i.e., the user-post bipartite graph) as it is easier to manipulate in practice.

Adversarial Attack on GNNs
As GNNs attain excellent performance on many graph mining tasks, their robustness against adversarial attacks has drawn increased attention in recent years [42]. RL-S2V [8] and Nettack [50] are two early GNN attacking algorithms aiming at lowering GNNs' node classification performance via adding/deleting edges or modifying node features under a given budget. Following these works, others have begun to investigate GNN robustness under different tasks, e.g., link prediction [4], knowledge graph embedding [34], and community detection [25]. However, none of the previous works have attempted to attack GNN-based fake news detectors, which have recently become popular amid massive adversaries engaging in fake news spread [37]. Compared to previous works using reinforcement learning to attack GNNs, our work utilizes a multi-agent setting to mimic the real-world misinformation campaign. In addition, to simulate the real-world attack setting, we only manipulate the edges of the news social engagement graph, since it is unlikely that attackers can modify news posts.

Adversarial Attack on Fake News Detectors
Given a wide array of machine learning-based fake news detectors, only a few works have investigated the robustness or vulnerabilities of fake news detectors [1,9,18,19,23,24,49]. Among those works, [19] examines the robustness of text-based news veracity classifiers over time and against attacks crafted by manipulating news sources. [1,23,49] probe the robustness of NLP-based fake news detectors by devising various attacks that distort the news content or inject adversarial texts. Nash-Detect [9] and AdRumor-RL [28] study the robustness of graph-based spam detectors and rumor detectors, respectively, using reinforcement learning frameworks. MALCOM [24] carries out attacks from another perspective, modifying the comments of each piece of news to fool fake news detectors that leverage multi-source data. PETGEN [18] simulates the behavior of malicious users on social media by generating a sequence of texts to attack sequence-based misinformation detectors. Unlike previous work, we are the first to explore the robustness of social-context-based fake news detectors using a multi-agent reinforcement learning framework.

PROBLEM FORMULATION
We formulate the problem of attacking social-engagement-based fake news detectors as attacking GNNs on a user-post sharing graph. In this section, we first define GNN-based fake news detection and then introduce our adversarial attack objective.

GNN-based Fake News Detection
A user-post sharing graph is defined as a bipartite graph $\mathcal{G} = \{U, P, E, \mathbf{X}_U, \mathbf{X}_P, Y\}$, where $U = (u_0, \cdots, u_m)$ is a set of users, $P = (p_0, \cdots, p_n)$ is a set of news posts, and an edge $e_{ij} = (u_i, p_j) \in E$ indicates that user $u_i$ has shared news post $p_j$. $\mathbf{X}_U$ and $\mathbf{X}_P$ are the feature matrices of user nodes and news nodes, respectively. According to previous works [10,17,30], the feature vectors of users and news can be composed of their text representations or handcrafted features. Following [10], we use the 300-dimensional Glove embeddings of a user's historical posts and of the news post text to represent $\mathbf{X}_U(i,:)$ and $\mathbf{X}_P(j,:)$, respectively. We use $\mathbf{X}$ to denote all node features for convenience. $y_p \in Y$ represents the label of $p \in P$, where 1 (0 resp.) represents fake news (real news resp.).
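To make the definition concrete, the following minimal sketch (the function name, shapes, and packed node layout are our assumptions, not the authors' released code) assembles $\mathcal{G}$ as a single PyG graph, with user nodes followed by post nodes:

```python
import torch
from torch_geometric.data import Data

def build_sharing_graph(x_users, x_posts, shares, y_posts):
    """Assemble G = {U, P, E, X_U, X_P, Y} as one PyG graph.

    x_users: [m, 300] Glove embeddings of users' historical posts
    x_posts: [n, 300] Glove embeddings of news post text
    shares:  list of (user_idx, post_idx) pairs, one per share
    y_posts: [n] tensor, 1 = fake news, 0 = real news
    """
    m = x_users.size(0)
    src = torch.tensor([u for u, _ in shares])
    dst = torch.tensor([m + p for _, p in shares])  # posts follow users
    # Store both edge directions so GNN message passing is symmetric.
    edge_index = torch.stack([torch.cat([src, dst]),
                              torch.cat([dst, src])])
    x = torch.cat([x_users, x_posts], dim=0)
    # Only post nodes carry veracity labels; -1 marks user nodes.
    y = torch.cat([torch.full((m,), -1, dtype=y_posts.dtype), y_posts])
    return Data(x=x, edge_index=edge_index, y=y)
```

Packing both node types into one homogeneous graph works here because users and posts share the same 300-dimensional Glove feature space.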
To detect fake news based on $\mathcal{G}$, a general GNN framework [15,45] can be applied. Concretely, to learn a news post $p$'s representation, a GNN aggregates its neighbors' information recursively:

$$h_p^{(l)} = h_p^{(l-1)} \oplus \mathrm{AGG}\big(\{h_u^{(l-1)} : u \in \mathcal{N}(p)\}\big), \quad l = 1, \ldots, L, \quad (1)$$

where $l$ is the GNN layer number and $\mathcal{N}(p)$ is the set of $p$'s neighbors. AGG is the GNN aggregator that aggregates neighbor embeddings; common aggregators employ attention [45], mean [15], and summation [15]. $\oplus$ represents the operation that combines the embedding of $p$ at the previous GNN layer with its aggregated neighbor embeddings; common approaches include concatenation and summation. Similarly, the representation of a user $u$ can be learned by the same process shown in Eq. (1).
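A minimal encoder instantiating Eq. (1) with the mean aggregator of GraphSAGE; the layer sizes are illustrative assumptions (PyG's `SAGEConv` performs the AGG and $\oplus$ steps internally):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class ShareGraphEncoder(torch.nn.Module):
    """Two rounds of Eq. (1): mean-aggregate neighbor embeddings,
    then combine them with the node's own previous-layer embedding."""
    def __init__(self, in_dim=300, hid_dim=64):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hid_dim)   # layer l = 1
        self.conv2 = SAGEConv(hid_dim, hid_dim)  # layer l = 2

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)  # h^(L) for every node
```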
To classify a news post $p \in P$, a GNN classifier $f$ takes $\mathcal{G}$ as input, where $\mathbf{X}_U$ and $\mathbf{X}_P$ provide the node features of users and posts. It maps $p$ to a prediction $\hat{y}_p \in (0, 1)$ after feeding $h_p^{(L)}$ at the last layer to an MLP and softmax layer. The GNN classifier can be trained on partially labeled post nodes in a semi-supervised fashion with the following cross-entropy loss:

$$\mathcal{L} = -\sum_{p \in P_L} \big[\, y_p \log \hat{y}_p + (1 - y_p) \log(1 - \hat{y}_p) \,\big],$$

where $P_L \subseteq P$ is the set of labeled post nodes.
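A sketch of the semi-supervised training loop implied by the loss above; the `labeled_mask` argument, the two-class log-softmax head, and the optimizer settings are our assumptions:

```python
import torch
import torch.nn.functional as F

class GNNDetector(torch.nn.Module):
    def __init__(self, encoder, hid_dim=64):
        super().__init__()
        self.encoder = encoder                   # e.g., ShareGraphEncoder
        self.head = torch.nn.Linear(hid_dim, 2)  # MLP + softmax head

    def forward(self, data):
        h = self.encoder(data.x, data.edge_index)
        return F.log_softmax(self.head(h), dim=-1)

def train_detector(model, data, labeled_mask, epochs=200):
    """Minimize the cross-entropy loss over labeled post nodes only."""
    opt = torch.optim.Adam(model.parameters(), lr=0.01)
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        out = model(data)
        loss = F.nll_loss(out[labeled_mask], data.y[labeled_mask])
        loss.backward()
        opt.step()
    return model
```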

Adversarial Attacks on GNN-based Fake News Detectors
At a high level, our problem can be regarded as attacking GNN-based node classification, but with practical constraints to simulate real-world misinformation campaigns. Specifically, the objective of the attacking method is to flip the GNN classification results of target social posts by maneuvering controlled malicious social media user accounts to share news posts. Note that we assume attackers can only perturb the graph by controlling malicious users to share news posts, not by deleting existing shared news posts.
We make this assumption because in a real-world setting, even though users can delete existing shared posts, the record of sharing relations may still exist in the database. Considering the massive social network data and the diverse fake news detectors employed by the platform, we assume the unknown target GNN is pre-trained on clean data in our problem setting (i.e., the training data is not poisoned by the adversary). Also, we assume that we have knowledge about the type of GNN the detector is trained on, but we do not have access to its model parameters. Thus, our problem is a grey-box evasive structural attack on the GNN-based node classification task. We formally define our attack objective as:

$$\max_{E'} \; \sum_{p \in P_T} \mathbb{1}\big(f(\mathcal{G}')_p \neq y_p\big) \quad \text{s.t.} \quad \mathcal{G}' = \{U, P, E \cup E', \mathbf{X}, Y\}, \;\; |U_C| \leq \Delta_U, \;\; |E'| \leq \Delta_E,$$

where $U_C$, $E'$, and $P_T$ represent the set of controlled users, manipulated edges, and target news posts, respectively. $\mathcal{G}$ represents the clean graph and $E'$ is the set of perturbed edges. $\Delta_U$ ($\Delta_E$ resp.) represents the budget of controlled users (modified edges resp.). The above adversarial objective essentially maximizes the misclassification rate of the target social posts.
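The objective can be evaluated directly by counting how many target posts the detector misclassifies on the perturbed graph. A sketch, assuming the hypothetical helpers defined earlier:

```python
import torch

def misclassification_rate(detector, data_perturbed, target_posts):
    """Value of the attack objective on G': the fraction of target
    post nodes whose predicted label differs from the ground truth."""
    detector.eval()
    with torch.no_grad():
        pred = detector(data_perturbed).argmax(dim=-1)
    flipped = pred[target_posts] != data_perturbed.y[target_posts]
    return flipped.float().mean().item()
```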

METHODOLOGY
In practice, misinformation campaigns are carried out by coordinated fraudsters manipulating social user accounts to evade detection. In this section, we first elucidate the properties of attackers, motivated by real-world misinformation campaigns. Then we present the multi-agent reinforcement learning framework we use to probe the robustness of GNN-based fake news detectors.

Attacker Knowledge.
We assume the attacker knows the features and labels of user and post nodes as well as their connections, along with the architecture (but not the parameters) of the target GNN detector. The above setting is practical since social media information is publicly accessible, and the attacker can easily infer node features and labels given fruitful related works in misinformation research.

Attacker Capability.
To imitate the real-world behavior of fraudsters as closely as possible, we define the capability of the users controlled by fraudsters (i.e., $U_C$) as follows (see the sketch after this list):

• Direct Attack: For $u \in U_C$, $p \in P_T$, $(u, p) \notin E$, we carry out the attack by controlling $u$ to share the target post $p$ directly. In real-world settings, given that a controlled user has shared many posts from trustworthy sources and appears legitimate, having that user share a fake news post helps alleviate the post's suspiciousness.

• Indirect Attack: For $u \in U_C$, $p \in P_T$, $(u, p) \in E$, we carry out the attack by controlling $u$ to share a post $p' \notin P_T$. The indirect attack exploits the neighbor aggregation mechanism of GNNs by exerting influence on the target post through changes to its neighborhood. In practice, for a controlled user that has already shared the target fake news post, one can let that user share posts from trustworthy sources to mislead the GNN's prediction on the target. Note that, as mentioned in Section 3.2, attackers are only allowed to add edges between user nodes and news post nodes.
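Both attack types reduce to the same primitive, adding one user-post share edge; only the choice of post differs. A sketch under the packed node layout assumed earlier (`add_share_edge` is a hypothetical helper):

```python
import torch

def add_share_edge(edge_index, user, post, num_users):
    """Attackers may only ADD edges (Section 3.2). For a direct attack,
    `post` is the target itself; for an indirect attack, it is a
    real-news post p' outside the target set, shared by a user already
    connected to the target."""
    p = num_users + post  # post ids follow user ids in the node set
    # Append both directions: columns are (user -> p) and (p -> user).
    new_edges = torch.tensor([[user, p], [p, user]])
    return torch.cat([edge_index, new_edges], dim=1)
```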

Agent Configuration.
Real-world evidence shows that multiple groups of malicious actors are engaged in misinformation campaigns [32,37,40]. Besides singletons who act individually, most misinformation campaigns are executed in coordination by professional agencies, since coordination reaches the campaign goal faster while maximizing the utility of existing resources. From the adversarial attack perspective, different types of controlled user accounts have distinct influences on target posts and different budgets. For instance, bot users are usually low-cost and come with a higher budget; however, they have few historical records, so each bot user has limited influence on target posts [37]. Crowd workers with credible and rich social profiles are usually expensive, but they have a stronger influence on target posts.
To model the above distinct malicious actor groups, previous single-agent RL frameworks are not applicable [8,43]. Therefore, we leverage MARL, which not only enables a personalized configuration for each group but also helps simulate the coordinated behavior between different groups. Specifically, we define three agents that control three distinct groups of user accounts, following the malicious account types introduced in [40]. We divide the user accounts based on the number of news posts they have shared; Figure 2 shows the distribution of the number of news posts shared by users in the Politifact and Gossipcop datasets. Table 1 compares the key properties of the following agents. 1) Agent 1 (Social Bots): Social bots, registered and fully controlled by automated programs, have been proven by many works to engage in fake news spreading [2,37]. The first agent controls the bot users, and it has a low cost and high budget. We randomly select users with only one connection in our datasets to represent newly created bot users. 2) Agent 2 (Cyborg Users): According to [40], cyborg users are registered by humans and partially controlled by automated programs. The easy switch of functionalities between humans and bots offers cyborgs unique opportunities to spread fake news. Since those users are camouflaged as humans, they usually have more historical engagements (i.e., connections to other posts). In our datasets, we randomly select users with more than 10 connections to represent these compromised users. The cost, budget, and influence of the cyborg agent sit between those of the other two agents. 3) Agent 3 (Crowd Workers): Crowd workers are usually of high cost since they get paid for each campaign; meanwhile, they have the strongest influence. We take users with more than 20 connections, all of which connect to real-news posts, to represent the crowd workers.
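The degree-based account selection above maps directly onto code. A sketch, assuming `user_idx`/`post_idx` hold the user-to-post share pairs with post-local indices; the thresholds are the ones quoted in the text:

```python
import torch

def assign_agent_pools(user_idx, post_idx, y_posts, num_users):
    """Split candidate controlled accounts by sharing history:
    bots: exactly 1 share; cyborgs: more than 10 shares;
    crowd workers: more than 20 shares, all of them real news."""
    deg = torch.zeros(num_users, dtype=torch.long)
    fake = torch.zeros(num_users, dtype=torch.long)
    deg.scatter_add_(0, user_idx, torch.ones_like(user_idx))
    fake.scatter_add_(0, user_idx, (y_posts[post_idx] == 1).long())
    bots = (deg == 1).nonzero().flatten()
    cyborgs = (deg > 10).nonzero().flatten()
    workers = ((deg > 20) & (fake == 0)).nonzero().flatten()
    return bots, cyborgs, workers, deg, fake
```

The returned `deg` and `fake` counts can be reused for the "good"/"bad" user split analyzed in Section 5.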

Attack Framework
In the real world, each agent above represents a malicious actor that aims to influence the fake news classification results. Given a set of target news posts $P_T$, the attack process can be modeled as a multi-agent cooperative reinforcement learning problem where all agents work together to maximize the misclassification rate of the target news posts. Figure 3 illustrates the attack process of the proposed MARL algorithm. First, actions from different agents are aggregated by the center controller; then, the aggregated actions are applied to the environment composed of the social engagement graph and the surrogate classifier; the updated state and rewards generated by the classifier are finally sent back to each agent for the next episode of optimization. In this subsection, we first define each component of the MARL framework, then introduce how we leverage deep Q-learning for optimization.
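The controller step can be sketched as follows; the concrete mixing ratio is an assumption for illustration, since the text states only that the proportion is fixed and motivated by real campaigns:

```python
import random

def controller_step(agent_proposals, proportions=(0.6, 0.3, 0.1)):
    """Center controller: merge per-agent action proposals into one
    executed batch using a fixed bot/cyborg/worker mixing ratio."""
    batch = []
    for proposals, frac in zip(agent_proposals, proportions):
        k = max(1, int(frac * len(proposals)))
        batch.extend(random.sample(proposals, min(k, len(proposals))))
    return batch
```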

MARL Framework.
Different from previous GNN attacks using single-agent RL, which can be modeled as a Markov Decision Process (MDP) [8,43], our MARL framework is a Markov game (MG). We formally define the MG and its components as follows:
• Action. As defined in Section 4.1.2, each $u \in U_C$ can only add edges based on its connection status to $p \in P_T$. Meanwhile, each agent $i$ controls a set of users $U_C^i$ according to Section 4.1.3. We use $a_t^i(u, p)$ to denote the action that adds an edge between user $u$ and post $p$. Thus, the action space for agent $i$ at time $t$ is $a_t^i \in \mathcal{A}^i \subseteq U_C^i \times P$. We use a centralized controller to aggregate agent actions: in each episode, the final actions come from the three types of agents in a fixed proportion, motivated by real-world misinformation campaigns.

• State. Since all agents work cooperatively to attack the same set of target posts $P_T$ against the same classifier $f$, all agents share the same state at time $t$, represented by $(\mathcal{G}'_t, P_T)$, where $\mathcal{G}'_t$ is the perturbed graph at time $t$.

• Reward. As a grey-box attack, we aim to flip the classification results of the target classifier. Since we have knowledge of the GNN architecture of the detector, we use one of the three GNN models (i.e., GAT, GCN, and GraphSAGE) as our surrogate target classifier and take its classification results on $P_T$ as the reward to guide the agents. Note that the reward is shared by all agents under the cooperative setting. After all agents take their actions under their budgets (i.e., one episode), the reward for each agent with respect to a target post $p \in P_T$ is

$$r_t(p) = \begin{cases} 1 & \text{if } f(\mathcal{G}'_t)_p \neq y_p, \\ -1 & \text{otherwise.} \end{cases}$$

• Terminal. After each agent makes finitely many modifications according to its own budget $\Delta_E^i$, the Markov game stops.

Deep Q-Learning.
To solve the above Markov game, we need to find the optimal policy that maximizes the expected long-term reward. Since each agent has its own controlled user accounts and budget, each agent $i$ has its own policy $\pi^i$ such that $a_t^i \sim \pi^i(\cdot \mid s_t)$. We use Q-learning to learn the optimal policy $\pi^{i,*}$ parameterized by a Q-function $Q^{i,*}(s_t, a_t^i)$. The optimal Q-value for agent $i$ can be represented by the following Bellman equation:

$$Q^{i,*}(s_t, a_t^i) = r(s_t, a_t^i) + \gamma \max_{a^{i\prime}} Q^{i,*}(s_{t+1}, a^{i\prime}),$$

where $a^{i\prime}$ represents agent $i$'s future action based on state $s_{t+1}$. The above equation suggests a greedy policy in which agent $i$'s best action under $s_t$ is the action that maximizes the Q-value:

$$\pi^i(a_t^i \mid s_t) = \arg\max_{a_t^i} Q^{i,*}(s_t, a_t^i).$$

For each target post $p \in P_T$, we would like to choose the controlled user $u \in U_C$ with the most influence on $p$ to flip the GNN classification result on $p$. Thus, using GNNs to parameterize the Q-function helps model each action's value. Specifically, we first employ a two-layer GraphSAGE [15] to obtain the embedding $h_{p,t}$ of each post node under the current state $s_t$ according to Eq. (1). Note that we only consider the 2-hop neighborhood of all target nodes and controlled user accounts, which reduces the state and action space. For agent $i$ at time $t$, with embeddings of controlled user accounts $h_{u,t}, u \in U_C^i$ and target node embeddings $h_{p,t}, p \in P_T$, the Q-value of action $a_t^i(u, p)$ is calculated as

$$Q^i\big(s_t, a_t^i(u, p)\big) = \big(\mathbf{W}_1 h_{u,t}\big)^\top \big(\mathbf{W}_2 h_{p,t}\big),$$

where two linear layers are applied to the two end-node embeddings before computing their dot product, which yields the Q-value of the given action.
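A sketch of the GNN-parameterized Q-function and the greedy action selection: two linear maps over the end-node embeddings of a candidate edge, followed by a dot product. It reuses the hypothetical `ShareGraphEncoder` from Section 3.1; dimensions are assumptions:

```python
import torch

class EdgeQNet(torch.nn.Module):
    """Q(s_t, a(u, p)) = (W1 h_u)^T (W2 h_p) over the current state."""
    def __init__(self, encoder, hid_dim=64):
        super().__init__()
        self.encoder = encoder  # e.g., a 2-layer ShareGraphEncoder
        self.w_user = torch.nn.Linear(hid_dim, hid_dim)
        self.w_post = torch.nn.Linear(hid_dim, hid_dim)

    def q_values(self, data, users, posts):
        # `users` and `posts` are aligned candidate pairs: the i-th
        # entry of each tensor forms one candidate edge (u_i, p_i).
        h = self.encoder(data.x, data.edge_index)  # embed state G'_t
        hu = self.w_user(h[users])   # controlled accounts of agent i
        hp = self.w_post(h[posts])   # target post nodes
        return (hu * hp).sum(dim=-1)  # one Q-value per candidate edge

def greedy_action(qnet, data, users, posts):
    """Greedy policy: pick the candidate edge (u, p) maximizing Q."""
    with torch.no_grad():
        q = qnet.q_values(data, users, posts)
    best = q.argmax().item()
    return users[best].item(), posts[best].item()
```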

EXPERIMENTS
As mentioned in Section 4.1.2, indirect attacks do not modify the edges between user nodes and target news nodes directly, and are more likely to be used by attackers in practice to evade detection. In this section, we conduct a series of experiments to validate the effectiveness of our proposed framework under the more realistic indirect setting and analyze the factors that affect MARL's attack performance. We then present experiment results comparing direct attacks against indirect attacks. Finally, we discuss some countermeasures that could be used by defenders. Specifically, we aim to answer the following research questions:

• RQ1: How does the performance of MARL compare to the baselines?
• RQ2: What factors affect MARL's attack performance?
• RQ3: How do direct attacks compare with indirect attacks?
• RQ4: What countermeasures can defenders adopt against such attacks?

Experiment Settings
In this subsection, we introduce the experiment settings for MARL indirect attacks: first the datasets, surrogate models, and baseline methods used for the experiments, then the implementation details.
Datasets.
We extract two social engagement graphs from the FakeNewsNet [38] dataset, which comprises the metadata of fake and real news posts and their engaged users on Twitter from two fact-checking sources: Politifact and Gossipcop. Following [10], we take the Glove 300D [33] embedding of a user's historical tweets as the user's features and the Glove embedding of the associated news content as a post's features. Note that our attack operates by having controlled users share posts; since the number of changed edges per user is within a tight budget, we assume all node features are unchanged during the attack process.
Surrogate Models.
Under the grey-box setting, the attacker only has information about the architecture of the model being attacked. Thus, the attack has to be performed on a surrogate model $\mathcal{M}$ that has the same GNN architecture as the target model. For the GNN-based fake news detectors, we include three classic GNN models: Graph Convolutional Network [22], Graph Attention Network [45], and GraphSAGE [15]. Table 3 shows the performance of these surrogate models on Politifact and Gossipcop. We trained the models to have similar performance across both datasets so that the attack performance of MARL can be measured comparably.

Baseline Attack Methods.
Due to the attacker's limited capability and the restricted candidate sets of both controlled users and target posts, we cannot take feature- and gradient-based attacks [50] as baselines. To assess the effectiveness of the proposed MARL framework, we compare it with the following baselines:

• Random-Edge (RD-Edge): A simple baseline that randomly selects controlled users and target posts to add edges until the budget is met.

• Random-Node (RD-Node): This baseline injects new user nodes into the graph and connects them with the target news nodes.
• Single Agent RL: To demonstrate the effectiveness of MARL, we created this baseline by limiting the attacker to a single type of agent. Specifically, we have three baselines named RL-A1 (Bot), RL-A2 (Cyborg), and RL-A3 (Crowd Worker).
Budget and Target Selection Criteria.
For the Politifact dataset, we randomly sampled 100 bot agents, 50 cyborg agents, and 20 crowd worker agents following the agent definitions in Table 1. For the Gossipcop dataset, we randomly sampled 1,000 bot agents, 500 cyborg agents, and 100 crowd worker agents.

Implementation Details.
For the Random-Edge method, we connect edges between sampled attack agents and news targets based on the agent node's degree. Specifically, we randomly connect bot agents with 1 news target, cyborg agents with 3 news targets, and crowd worker agents with 5 news targets. For the Random-Node method, we add 5 user nodes for each of the three agent types. We generate the embedding for each injected node by randomly sampling 20 nodes from the corresponding agent type and taking the average of their embeddings. We connect the generated user nodes with target news nodes in the same way as in the Random-Edge method. We use PyG [13] to implement all GNN algorithms. The MARL algorithm is implemented based on the RL-S2V code provided by [8]. Our code and data are publicly available.¹
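A sketch of the RD-Edge baseline under the per-agent edge budgets described above (1/3/5 edges for bot/cyborg/crowd-worker agents); `agent_pools` and `targets` follow the earlier hypothetical helpers:

```python
import random

def random_edge_baseline(agent_pools, targets, budgets=(1, 3, 5)):
    """RD-Edge: each sampled agent connects to randomly chosen target
    posts until its per-agent-type edge budget is exhausted."""
    edges = []
    for pool, budget in zip(agent_pools, budgets):
        for u in pool.tolist():
            for p in random.sample(targets, min(budget, len(targets))):
                edges.append((u, p))
    return edges
```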
Performance Metrics.
Since we only aim to flip the classification results of a selected group of target posts, we use the success rate (SR) as the metric to evaluate attack performance: the number of misclassified target posts divided by the total number of target posts after the attack.

RQ1: Performance of MARL
Since attackers are more likely to use indirect attacks than direct attacks in practice to evade detection, we study targeted indirect attacks on both fake and real news in Politifact and Gossipcop. Table 4 reports the attack performance of MARL compared to the baselines. From the table, we make the following observations:

• Comparing the success rate of indirect attacks on fake news that falls within the decision boundary of [0.5, 0.7] to those outside of the decision boundary, we see an increase in success rate from 0.18 to 0.33 when attacking fake news in the Politifact dataset on GAT detectors.

• Another interesting finding is that GCN is more sensitive to edge perturbations than GAT and GraphSAGE. Attackers can achieve fairly good performance on GCN with only a small number of edges added to the graph. For instance, we reach a success rate of 0.48 with just 210 edges added when attacking fake news on Politifact. With the same attack budget, the success rates on GAT and GraphSAGE detectors are 0.12 and 0.14 respectively, much lower than on GCN. Previous works [7,8] have also shown that GCNs are vulnerable to structural adversarial attacks due to the low breakdown point of their weighted mean aggregation method.

RQ2: Attack Performance Analysis
In this subsection, we answer RQ2 and provide insights on the factors that affect MARL's attack performance.Specifically, we provide analysis based on agent types and news types.

Agent Types.
Figure 4 shows the ablation study results on single-agent RL. Specifically, we use the RL-A1, RL-A2, and RL-A3 methods to carry out targeted indirect attacks on fake news in the Politifact and Gossipcop datasets with increasing attack budgets (i.e., using more agents). We make the following observations:

• The overall attack performance increases with the attack budget for all three types of agents.

• The performance gain slows down after hitting a threshold. Therefore, attackers need to select the optimal number of agents to perform indirect attacks.

• Crowd worker agents achieve better performance than bot and cyborg agents on all three GNNs given the same attack budget. This is expected since crowd worker agents have stronger influence and their shared posts are connected to real news; therefore, they exert more influence on fake news.
Based on the above observations, we divide user nodes into "good" and "bad" groups. Specifically, users for whom more than 80% of the shared news is fake go into the "bad" group, and users for whom less than 20% of the shared news is fake go into the "good" group.
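This split reuses the per-user share counts computed when assigning agent pools; the 80%/20% thresholds come straight from the text:

```python
def split_good_bad(deg, fake):
    """'bad' users: >80% of shared news is fake;
    'good' users: <20% of shared news is fake.
    `deg` and `fake` are the per-user long tensors returned by the
    hypothetical assign_agent_pools helper."""
    ratio = fake.float() / deg.clamp(min=1).float()
    good = ((ratio < 0.2) & (deg > 0)).nonzero().flatten()
    bad = ((ratio > 0.8) & (deg > 0)).nonzero().flatten()
    return good, bad
```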

News Types.
Intuitively, we conjecture that news post nodes with higher degrees are more robust to attacks than those with lower degrees. To verify this hypothesis, we attack different groups of fake news in Politifact and Gossipcop according to their node degree. For this experiment, we use 10 crowd worker agents for Politifact and 50 crowd worker agents for Gossipcop. As shown in Table 5, it is significantly harder to attack news with a higher degree across all three GNNs. Even on the most vulnerable GNN (i.e., GCN), MARL suffers a significant performance decrease (80% on Politifact and 90.9% on Gossipcop) when attacking fake news with a degree of more than 100 compared to news with a degree of less than 10. Another observation is that GAT is more robust than GCN and GraphSAGE on news with degrees between 10 and 100: GAT only has a performance drop of 12.5% and 21.4% on Politifact and Gossipcop respectively when the news degree increases from below 10 to the 10-100 range, whereas the performance of GCN and GraphSAGE almost halves on both datasets. This is likely due to the attention mechanism of GAT, which makes it less sensitive to degree changes.

RQ3: Direct vs. Indirect
Recall from Section 4.1.2 that direct attacks mean attackers modify the edges directly linking user nodes to target news nodes. Although direct attacks are more conspicuous in real-world scenarios, they can achieve better performance than indirect attacks due to the direct perturbation of the graph structure. For this experiment, we sampled 25 "good" users for Politifact and 250 "good" users for Gossipcop to perform targeted attacks on fake news in both datasets.

Figure 5 shows the comparison between direct and indirect attacks. Direct attacks improve the performance on GAT by 81.8% and 147.6% on the two datasets respectively. However, we see a decrease in performance on GCN and GraphSAGE detectors across both datasets, especially on GCN detectors on the Politifact dataset.
Based on the observations from Figure 5, we are interested in whether the performance of direct attacks behaves similarly across news with different degrees. For this experiment, we categorize fake news in Politifact and Gossipcop based on news degree: news with fewer than 10 tweets is classified as low, news with more than 100 tweets as high, and news in between as mid. We use the same agent configuration as in Figure 5 to attack fake news in both datasets. Figure 6 shows that direct attacks are effective on news with low degrees on GAT detectors, while they are less effective on GraphSAGE detectors regardless of news degree.
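The low/mid/high buckets used here are straightforward to reproduce; a sketch, assuming `post_deg` is a tensor holding the number of engaged tweets per news post:

```python
def bucket_by_degree(post_deg):
    """low: fewer than 10 tweets; high: more than 100; mid: between."""
    low = (post_deg < 10).nonzero().flatten()
    high = (post_deg > 100).nonzero().flatten()
    mid = ((post_deg >= 10) & (post_deg <= 100)).nonzero().flatten()
    return low, mid, high
```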

RQ4: Countermeasures against Attacks
Based on our experimental findings, we discuss countermeasures against fraudsters who manipulate news social engagement from two perspectives. 1) From the machine learning security perspective, there is fruitful research on defending against graph adversarial attacks [42]. Approaches like adversarial training [12], anomaly detection [11], and robust GNN models [14,20] can be leveraged to defend against the attacks. 2) From the practical perspective, social media platforms should pay equal attention to both "bot" and seemingly "good" users. As shown in the experiments, attackers can leverage users' good posting history to carry out successful targeted attacks on fake news that fool GNN-based detectors. Since the indirect attack is effective against many GNN detectors, platforms should monitor the broader engagement activities of accounts engaged with a target news post instead of the target news alone. Experiment results also show that there is no universally robust model, which suggests that platforms should adopt diverse trust and safety models.

CONCLUSION AND FUTURE WORK
In this paper, we aim to understand the vulnerability of graph neural network-based fake news detectors under structural adversarial attacks. To the best of our knowledge, this is the first work to attack GNN-based fake news detectors. We leveraged a multi-agent reinforcement learning framework to mimic the attack behavior of fraudsters in real-world misinformation campaigns, aiming to provide insights on how to develop more robust GNN-based fake news detectors against future adversarial attacks.
Our experiment results show that MARL improves overall attack performance compared to our baselines and is highly effective against GCN-based detectors.
Even though we obtained promising experimental results, this paper has two major limitations: 1) this work only employs a simple heuristic to select users for action aggregation; 2) the search space of the Q-network is considerably large, resulting in high computational cost on larger datasets like Gossipcop. Therefore, several interesting directions need further investigation. The first is to automate the process of selecting optimal agents for action aggregation. The second is to effectively reduce the deep Q-network's search space. Finally, we used a vanilla MARL framework in this paper; it would be interesting to explore more sophisticated MARL frameworks for this task.

ETHICAL STATEMENT
The Twitter data used in this paper were obtained from the Twitter API and comply with the Twitter user agreement. Although we propose an adversarial attack framework against GNN-based fake news detectors, our intention is to probe and enhance the robustness of existing detectors. Therefore, we do not endorse the use of this work for unethical purposes in any shape or form.

Figure 1: An illustration of attacking a fake news classifier via manipulating news posts' social engagement. The classifier misclassifies the fake news after the fraudster, who has shared many real news posts, shares it.

Figure 2: The distribution of user accounts according to the number of news posts they have shared. We sample accounts in different ranges to represent different user types.

Figure 3: The proposed MARL framework for generating adversarial edge perturbations against a GNN-based fake news classifier. See Section 4.2 for more details.

Figure 5: Comparison between the direct and indirect attack on Politifact and Gossipcop on fake news with degree less than 10, using "good" users.

Figure 6: Results of direct attacks across different types of news based on their degrees. A brighter color indicates better attack performance.

Table 1: Comparison of properties among bot, cyborg, and crowd worker agents.

Table 2: Dataset statistics and agent configurations for the Politifact and Gossipcop datasets.

Table 3: Performance of surrogate models measured by accuracy and F1 score.

Table 4: Results of using MARL to perform indirect targeted attacks compared with several baselines. Experiments are repeated five times, and the average success rate is reported.

Figure 4: Indirect attack performance of different types of agents on fake news in the Politifact and Gossipcop datasets. Performance on GAT, GCN, and GraphSAGE is marked in blue, red, and green respectively.

Table 5: Indirect targeted attacks on fake news in Politifact and Gossipcop based on news degree, using crowd worker agents.