Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News Detection

Despite considerable advances in automated fake news detection, due to the timely nature of news, it remains a critical open question how to effectively predict the veracity of news articles based on limited fact-checks. Existing approaches typically follow a "Train-from-Scratch" paradigm, which is fundamentally bounded by the availability of large-scale annotated data. While expressive pre-trained language models (PLMs) have been adapted in a "Pre-Train-and-Fine-Tune" manner, the inconsistency between pre-training and downstream objectives also requires costly task-specific supervision. In this paper, we propose "Prompt-and-Align" (P&A), a novel prompt-based paradigm for few-shot fake news detection that jointly leverages the pre-trained knowledge in PLMs and the social context topology. Our approach mitigates label scarcity by wrapping the news article in a task-related textual prompt, which is then processed by the PLM to directly elicit task-specific knowledge. To supplement the PLM with social context without inducing additional training overheads, motivated by an empirical observation on user veracity consistency (i.e., social users tend to consume news of the same veracity type), we further construct a news proximity graph among news articles to capture the veracity-consistent signals in shared readerships, and align the prompting predictions along the graph edges in a confidence-informed manner. Extensive experiments on three real-world benchmarks demonstrate that P&A sets a new state of the art for few-shot fake news detection by significant margins.


INTRODUCTION
The proliferation of fake news online poses a pressing concern for human cognition [8,38] and social development [35,47]. Given the time-sensitive nature of news stories [9], it is crucial that automated fake news detection applications enable accurate few-shot veracity predictions based on limited related fact-checks.
Nevertheless, the success of existing approaches is usually contingent on access to abundant fact-checked articles and auxiliary features, which is not guaranteed in practice. Regardless of whether they utilize the news content [44,57] or the social context graph [4,31], the majority of methods adopt a "Train-from-Scratch" paradigm, where weight optimization depends solely on the supervised training data. Consequently, these methods typically require large-scale labeled news articles [37,45] and incorporate auxiliary information that is laborious to retrieve, including stance annotations [31], user history posts [6] and knowledge bases [4,7,15]. Under label scarcity, these methods suffer from generalization issues. To address this challenge, pre-trained language models (PLMs) [5,28] have been adapted to the task following a "Pre-Train-and-Fine-Tune" paradigm [3,24,33], where a task-specific classification head is stacked upon a PLM. While methods under this paradigm benefit from the pre-trained syntactic and semantic knowledge, optimizing the auxiliary layers still requires abundant high-quality annotations, and fine-tuning a PLM alongside a randomly initialized task-specific architecture has been shown to distort the high-quality pre-trained features and impair model robustness [14,22]. Under both paradigms, the real-world label scarcity of emerging news events creates a fundamental hurdle for weight optimization, resulting in substantial performance degradation.
In this work, we develop "Prompt-and-Align" (P&A), a novel prompt-based paradigm for few-shot fake news detection. Inspired by recent advances in prompt-based learning that exploit PLMs as powerful few-shot learners in various natural language processing tasks [25,34,36,40], we re-formulate fake news detection as a task-oriented text completion problem embedded in a natural language prompt. As illustrated in Figure 1, in contrast to existing "Train-from-Scratch" and "Pre-Train-and-Fine-Tune" paradigms that incorporate task-specific architectures, P&A utilizes a textual prompt that encodes task-related knowledge. Prompting establishes semantic relevance between the task and the PLM [27], elicits the latent "built-in" knowledge from PLMs for task-specific inference [18], and thereby effectively alleviates the label scarcity bottleneck.
On the basis of few-shot prompting, how can we further incorporate knowledge from the social context? An intuitive solution would be to combine the prompting outputs with a Graph Neural Network (e.g., GCN [20]) on the social graph. However, this does not address the problem, as the significant gap between the PLM pre-training objective and the downstream GNN classification objective can impair model performance. Across a sparsely labeled social graph, only a small portion of unlabeled nodes are involved in message passing, which fundamentally limits the effectiveness of the GNN.
To leverage informative structural patterns from the social context without inducing additional training overheads, we investigate the news consumption preferences of social media users, and make a key observation on user veracity consistency: users are consistently attracted to news articles of a certain veracity type (i.e., real or fake). In other words, active social users connect multiple news articles with veracity-consistent signals via their engagements. Motivated by this empirical finding, we construct a news proximity graph to connect news articles with shared readership. To fully utilize the limited number of high-quality labels and high-confidence predictions, we first enhance the prompting predictions with ground-truth training labels and soft labels generated via pseudo labeling [23], and then align the independent predictions over the veracity-consistent graph edges, which explicitly regularizes the predictions of unlabeled samples to alleviate label scarcity.
To summarize, our contributions are three-fold:
• Empirical Finding: We present a novel finding on how user veracity consistency leads to shared veracity between news articles with large common audiences.
• Method: We propose "Prompt-and-Align" (P&A), a novel paradigm for few-shot fake news detection that combines the benefits of both prompting and veracity-guided social alignment.
• Experiments: Extensive experiments on three real-world benchmarks demonstrate that P&A improves few-shot fake news detection performance by significant margins.

RELATED WORK
Our work brings together two active lines of research.

Fake News Detection on Social Media
Deep learning models have shown impressive capacity for learning news representations. Existing methods can be generally categorized into two paradigms based on their training schemes. The dominant "Train-from-Scratch" paradigm employs various neural architectures, including Recurrent Neural Networks (RNNs) [39,44,56], Convolutional Neural Networks (CNNs) [54,57] and Graph Neural Networks (GNNs) [30,49,55], to learn semantic and structural representations. To further enhance model prediction, existing methods incur high costs in retrieving various sources of auxiliary information, including large-scale user responses [39], entity descriptions from knowledge bases [7,15], open-domain evidence [43], news producer descriptions [31] and social media profiles [6]. The utilization of PLMs [3,24,33], specifically fine-tuning them on annotated news samples, opened a new era of "Pre-Train-and-Fine-Tune" with enhanced generalizability. However, existing methods typically incorporate task-specific architectures alongside a PLM, which require abundant labeled samples to optimize. Although initial progress has been made in meta-learning based multimodal fake news detection [51] involving news articles and images, the informative social context remains unconsidered. Hence, in this paper, we study the task of few-shot fake news detection on social media, which aims to effectively predict news veracity given a small number of annotated news samples and minimal social context (i.e., the numeric IDs of related social users).

Prompt-Based Learning
Recent years have seen PLMs [1,5,28] emerge as a strong impetus for advancing natural language understanding. Among PLM-based approaches, prompt-based learning has shown promising potential under few-shot scenarios. As per the investigations of [18,34], PLMs acquire abundant factual and commonsense knowledge during their pre-training stage, which can be elicited by re-formulating downstream tasks into text completion questions, either with manually pre-defined templates [40,48] or with patterns learned on a small training set [25,36]. Guided by this knowledge, prompt-based methods have achieved impressive breakthroughs in a wide range of tasks including graph learning [13], sentiment classification [10], natural language inference [41], relation extraction [2], and stance detection [12]. Despite preliminary investigation into knowledge-based prompting for fake news classification [17] (i.e., news documents augmented by auxiliary entity descriptions from a knowledge base as the PLM input), existing efforts overlook the informative social graph topology. On the related front of rumor detection, recent work leverages prompt learning for zero-shot transfer learning [26]. However, [26] focuses on learning language-agnostic contextual representations across different domains, which is inherently orthogonal to our contributions. In this work, we propose a novel "Prompt-and-Align" (P&A) few-shot paradigm that addresses the generalization challenge by prompting the PLM for task-related knowledge, and further alleviates the label scarcity issue by incorporating news readership patterns.

PROPOSED APPROACH
In this section, we present our proposed "Prompt-and-Align" (P&A) paradigm for few-shot fake news detection, overviewed in Figure 2. P&A consists of two major components: (1) a "Prompt" component (Section 4.1) that elicits task-specific knowledge from the PLM to predict news article veracity; and (2) an "Align" component, motivated by our empirical finding on user veracity consistency (Section 4.2), which leverages the informative news readership patterns to enhance the prompting predictions via confidence-informed alignment over the social graph (Section 4.3).

Prompting for Task-Specific Knowledge
The crucial factor in effectively connecting the PLM pre-training objective (i.e., masked language modeling) with the downstream fake news detection objective lies in identifying a shared template applicable to both tasks. To this end, we first re-formulate fake news detection as a masked token prediction problem.
Let M be a PLM with vocabulary V. Contrary to standard classification methods, which assign news articles to a label logit without any inherent meaning, prompting enables us to solve the fake news detection problem as follows. Given a news article x, we first construct a corresponding natural language prompt T containing a mask token. Taking T as input, M produces a score vector s(T) ∈ R^{|V|} over V at the masked position to complete the text; for each token w ∈ V, the corresponding score logit s_w(T) reflects how well w fits the masked position. Here, the PLM is prompted to answer what is the most likely token that fits the [MASK] in T.
Given s(T), we define the answer space A = {"news", "rumor"} as a small subset of the vocabulary containing the permissible answers for our fake news detection prompt. Then, we map A to binary labels as "news" → 0 (real) and "rumor" → 1 (fake).
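As a toy illustration of this cloze re-formulation, the sketch below wraps an article in a template containing a single mask token and records the answer-to-label mapping. The template wording and the whitespace tokenizer are illustrative assumptions; the authors' actual template (P1) and the PLM's subword tokenizer are not reproduced here.

```python
MASK = "[MASK]"
ANSWERS = {"news": 0, "rumor": 1}  # answer token -> binary veracity label

def build_prompt(article_text):
    # Wrap the article in a task-related template with one mask token.
    # The wording here is a placeholder, not the authors' exact template.
    return f"{article_text} This is a piece of {MASK}"

def mask_index(prompt):
    # Mask position under simple whitespace tokenization; a real PLM
    # tokenizer (e.g., WordPiece) would be used in practice.
    return prompt.split().index(MASK)
```

The PLM's score vector at `mask_index` is then read off at the two answer-token positions to obtain the veracity scores.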
To retrieve the conditional probability assigned to each answer token, we convert s(T) into a conditional probability distribution p(T) ∈ R^{|V|} over all vocabulary tokens via the softmax function, and for each answer token a ∈ A we extract its probability score p_a(T). Intuitively, our prompt-based approach incorporates task-specific semantic knowledge by prepending a task-related textual template to the input, and queries the PLM on whether it associates an article's text with news or rumor, by asking it to fill the masked token with either "news" or "rumor". This directly elicits the model's task-specific knowledge, which significantly reduces the amount of training data it needs to learn effectively.

Training objective. The highest-scoring token in the answer space A implies the PLM's prediction of news veracity. Hence, given the PLM M and our textual prompt T, the goal is to maximize the probability score of the correct answer token a+ and penalize the incorrect token a−. For instance, real news has a+ as "news" and a− as "rumor". The cross-entropy (CE) loss over the answer space A is unsuitable here, as it only focuses on the two logits that correspond to the two answer tokens in A; hence, it fails to suppress the logits of non-answer tokens (e.g., a task-irrelevant token such as "coffee" can be assigned a large logit for filling in [MASK]), resulting in a suboptimal learned model. If we instead compute the CE loss over the entire vocabulary, the loss does not distinguish between a− and other non-answer tokens, and thus cannot specifically suppress a−. To address these issues, inspired by the decoupling label loss [48], we adopt a loss function that simultaneously discourages non-answer tokens and keeps a specific focus on the answer token probabilities extracted from p(T) (see Eq. 3), formulated with the binary cross-entropy (BCE) loss over the n annotated training samples.
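The objective above can be sketched as follows. This is a minimal illustration, assuming the vocabulary logits at the masked position are softmax-normalized before the answer probabilities are read off; it is not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def prompt_loss(vocab_scores, pos_ids, neg_ids):
    """vocab_scores: (n, |V|) masked-position logits for n training prompts;
    pos_ids / neg_ids: (n,) vocabulary indices of the correct / incorrect answer."""
    probs = F.softmax(vocab_scores, dim=-1)           # distribution over the vocabulary
    rows = torch.arange(len(probs))
    p_pos = probs[rows, pos_ids]                      # probability of the correct answer
    p_neg = probs[rows, neg_ids]                      # probability of the incorrect answer
    # BCE-style objective: push the correct token up and the incorrect one down.
    # Because probs is normalized over the full vocabulary, raising p_pos also
    # suppresses non-answer tokens, matching the motivation described above.
    return -(torch.log(p_pos) + torch.log(1.0 - p_neg)).mean()
```

A model that assigns a large logit to the correct answer token therefore incurs a smaller loss than one favoring the incorrect answer.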
Base prediction acquisition. To adapt the pre-trained weights to our fake news detection problem, we tune the parameters of M by minimizing the above loss function (Eq. 4) with the textual prompt, and retrieve a set of "base prediction" scores w.r.t. the two class label tokens (mapped from the answer tokens) for all n news articles in our dataset, denoted as s_A(T) ∈ R^{n×2}. As these scores are extracted from a probability distribution over all vocabulary tokens, we apply a softmax function to focus the distribution on our task-specific answer space A, yielding a base prediction matrix P ∈ R^{n×2}. At this point, we have n independent prompting predictions obtained by probing the PLM with task-specific natural language prompts. Despite the rich semantic knowledge encapsulated in the PLM, it contains no social knowledge in terms of social graph topology. This motivates our innovations in further integrating auxiliary social context knowledge, specifically via distilling crowd wisdom from news readerships.
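The renormalization step that turns the answer-token scores into the base prediction matrix P can be sketched with numpy; the function name is illustrative.

```python
import numpy as np

def base_prediction_matrix(answer_scores):
    """answer_scores: (n, 2) array of the two answer-token scores per article.
    Softmax over the answer space so each row is a distribution over
    {"news" (real), "rumor" (fake)}."""
    e = np.exp(answer_scores - answer_scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)  # rows sum to 1
```

Each row of the result is one article's independent prompting prediction, later refined by the "Align" component.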

Veracity-Consistent Readership Modeling
Social context, usually in the form of graphs, constitutes a distinctive property of the fake news detection problem [46]. In this subsection, we conduct a preliminary analysis of real-world news engagements by social media users, with the aim of investigating veracity-related social properties. Specifically, we explore the following question: are there any connections between social users' news engagements and news veracity? To measure social users' news consumption preference in terms of news veracity, we compute a "fake news affinity" (FNA) score for each user u ∈ U, defined as the fraction of u's news engagements that involve fake news articles (Definition 1).
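A minimal sketch of the FNA computation, assuming FNA is the fraction of a user's engagements that involve fake articles (consistent with the 0 ↔ only-real and 1 ↔ only-fake endpoints in the observation below); the function and data layout are illustrative.

```python
from collections import defaultdict

def fna_scores(engagements, labels, theta_e=5):
    """engagements: iterable of (user_id, news_id) repost records;
    labels: dict mapping news_id -> 1 (fake) or 0 (real);
    theta_e: minimum engagement count for a user to be considered active."""
    per_user = defaultdict(list)
    for user, news in engagements:
        per_user[user].append(labels[news])
    # FNA = fraction of a user's engagements that involve fake news;
    # users below the activity threshold are filtered out.
    return {u: sum(v) / len(v) for u, v in per_user.items() if len(v) >= theta_e}
```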
Empirical observation. To capture patterns representative of user engagement preference, we set a user engagement threshold θ_e = 5, and focus solely on the active social users with at least θ_e engagements in spreading news articles. In Figure 3, we visualize the FNA score distribution of active social users with a histogram, and make the following observation:

Observation 1 (User Veracity Consistency). Active social users with numerous news engagements tend to have FNA scores either approaching 0 (only engaging in spreading real news) or 1 (only engaging in spreading fake news).
Our observation echoes the confirmation bias theory [32] for news consumption, where people tend to seek and retain information that reinforces their prior beliefs. In the fake news detection task, confirmation bias can be manifested as coordinated group behavior [42], specifically with the aim of manipulating opinions on social platforms. For instance, social media has been shown to create opinion polarization towards political events [21], and users can coordinate to gain control over celebrity discussions on gossip sites [29]. Consequently, user attention tends to become highly segregated on certain opinions, resulting in repeated engagements in news articles of a similar veracity type.
As an active social user engages in spreading numerous news pieces, the set of engaged news articles is connected with veracity-consistent signals. Therefore, if two news articles are closely connected by a large shared readership, the articles are highly likely to have the same veracity label.

News Proximity Graph. Guided by our empirical observation, we construct a news proximity graph G that encodes the shared readerships between news article nodes T. Let A_T ∈ R^{n×n} be the adjacency matrix of G that quantifies shared readerships, which we will next derive from the set R of user repost records.
To focus on the active social users, we apply the user engagement threshold θ_e to remove users with fewer than θ_e news reposts. From R, we construct a user engagement matrix B ∈ R^{|U′|×n}, where U′ ⊆ U denotes the active user set filtered via θ_e, and element B_ij denotes the engagement intensity of user u_i ∈ U′ towards news article x_j ∈ T. The value of B_ij is obtained from the intensity entry c_ij in the corresponding record (u_i, x_j, c_ij) ∈ R.
On the basis of the user engagement matrix B, we formulate the news proximity matrix A_T ∈ R^{n×n} as B⊤B. Then, we normalize A_T to derive the adjacency matrix used for alignment.
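A minimal sketch of the proximity-matrix construction. The symmetric degree normalization D^{-1/2} A D^{-1/2} used here is one plausible choice (common in GCN-style propagation); the paper's exact normalization formula is not reproduced in this excerpt.

```python
import numpy as np

def proximity_matrix(B):
    """B: (|U'|, n) engagement-intensity matrix of active users over n articles."""
    A = B.T @ B                     # shared-readership weights between article pairs
    deg = A.sum(axis=1)
    d = np.zeros_like(deg)
    d[deg > 0] = deg[deg > 0] ** -0.5
    return d[:, None] * A * d[None, :]   # D^{-1/2} A D^{-1/2}
```

Articles with overlapping readerships receive positive edge weights, while articles sharing no readers remain disconnected.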

Confidence-Informed Social Alignment
To inject social context into the prompting predictions, motivated by our empirical finding on user veracity consistency (Section 4.2), we construct a news proximity graph G that connects news article nodes with veracity-consistent signals from shared readerships. However, training a task-oriented GNN (e.g., GCN [20]) on G is inapplicable under label sparsity, as the involvement of unlabeled nodes is heavily limited during message passing, resulting in poor model performance. To fully exploit label knowledge and the social graph structure without introducing additional trainable modules, we propose a social alignment component that combines the base predictions in a confidence-informed manner. In addition to prompting, our graph-based alignment further bridges the gap between general PLM pre-training and the downstream fake news detection task, which enhances the effectiveness of our approach.
4.3.1 Label Knowledge Acquisition. We first discuss how our alignment component acquires knowledge from the relatively "cheap" base predictions P derived from task-specific prompting. The key idea is to leverage the ground-truth training labels and a small subset of unlabeled samples (i.e., news articles) with high prediction confidence. To this end, we transform P into an enhanced confidence matrix H ∈ R^{n×2}. Specifically, H consists of H_l ∈ R^{m×2} and H_u ∈ R^{(n−m)×2}, which respectively contain the enhanced confidence scores of the m labeled and n − m unlabeled samples. H is computed via the following two steps.
Thresholded pseudo labeling. The prompting module of P&A provides access to abundant soft labels assigned by the PLM, in the form of the base predictions P_u for the unlabeled data T_u. To fully utilize these soft labels, we devise a thresholded variant of the pseudo labeling technique [23], denoted as ThresholdedPL(·). Given P_u, we select the samples whose class probability scores are equal to or above the θ_p-th percentile of all predicted confidence scores in P_u. We then assign these high-confidence samples one-hot labels w.r.t. the class with maximum predicted probability for each sample, while the remaining low-confidence samples are kept unchanged. We augment P_u with the resultant pseudo labels to obtain the enhanced confidence scores H_u.

4.3.2 Veracity-Guided Prediction Alignment. The knowledge-infused predictions in H facilitate effective utilization of the limited high-quality ground-truth labels and predictions. Hence, given H and the veracity-consistent edges of the news proximity graph G, we conduct veracity-guided social alignment to combine the independent predictions in a confidence-informed manner. Specifically, we encourage label smoothness over the graph structure by aligning H via propagation over the k-hop neighborhood of each news article node. From the propagated scores, we obtain the final aligned predictions Ŷ, where the predicted class label of news article x_i ∈ T_u is assigned by ŷ_i = argmax_j Ŷ_ij.
Here, the obtained ŷ_i aggregates social knowledge (specifically, veracity signals) through our proposed social alignment component, thereby refining the base predictions derived from prompting.
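The two steps above can be sketched as follows. The propagation is written as k repeated multiplications by the normalized proximity matrix, which is one plausible reading of the alignment described above; function names are illustrative.

```python
import numpy as np

def thresholded_pl(P_u, q=95):
    """One-hot the unlabeled predictions whose confidence (max class probability)
    reaches the q-th percentile of all confidence scores; leave the remaining
    low-confidence rows unchanged."""
    conf = P_u.max(axis=1)
    H_u = P_u.copy()
    hard = conf >= np.percentile(conf, q)
    H_u[hard] = np.eye(P_u.shape[1])[P_u[hard].argmax(axis=1)]
    return H_u

def align(A_hat, H, k=2):
    """Propagate the enhanced confidence scores H over the normalized news
    proximity graph A_hat for k steps, then read off class labels."""
    Y = H
    for _ in range(k):
        Y = A_hat @ Y  # each node absorbs its neighbors' veracity signals
    return Y.argmax(axis=1)
```

In the two-node toy below, a weakly wrong prediction is corrected by its neighbor's ground-truth one-hot label after a single alignment step, mirroring the case study in Section 5.6.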

EXPERIMENTS
In this section, we empirically evaluate our "Prompt-and-Align" (P&A) paradigm to investigate the following five research questions:
• Few-Shot Performance (Section 5.2)
• Ablation Study (Section 5.3)
• Parameter Sensitivity (Section 5.4)
• Impact of Prompt Design (Section 5.5)
• Case Study (Section 5.6)

Experimental Setup
5.1.1 Datasets. We conduct evaluation on three real-world benchmark datasets commonly adopted by existing work, namely the FakeNewsNet [45] public benchmark, consisting of the PolitiFact and GossipCop datasets, and the FANG [31] dataset. All datasets contain news articles collected from leading fact-checking websites and the related social user engagements (i.e., IDs of repost users) retrieved from Twitter. The dataset statistics are shown in Table 1.
We split the data into training and test sets by randomly sampling n news items as the training data (with n ∈ {16, 32, 64, 128}), among which the ratio of fake news to real news is set to 1 : 1.

Baselines.
We benchmark P&A against ten representative baseline methods that adopt the following paradigms. "Train-from-Scratch" methods devise task-specific neural architectures for fake news detection. dEFEND\c is a variant of the hierarchical attention framework dEFEND [44] without user comments. SAFE\v is a variant of SAFE [57] without the visual component, which adopts a TextCNN [19] based module to encode the news article texts. SentGCN and SentGAT [49] are graph-based approaches that respectively employ the Graph Convolutional Network (GCN) [20] and the Graph Attention Network (GAT) [50] to capture indicative sentence interaction patterns. GCNFN [30] utilizes deep geometric learning to model news dissemination patterns along with textual node embedding features. FANG [31] constructs a heterogeneous social graph with news articles, sources and social users, and adopts a GraphSAGE [11] based framework to detect fake news. For a fair comparison, social context based methods are implemented with the components for encoding the news content, user-news relations and social user identities.
PLM-based methods leverage the rich pre-trained knowledge in PLMs to mitigate label scarcity. Among this category, "Pre-Train-and-Fine-Tune" methods BERT-FT and RoBERTa-FT respectively combine the BERT [5] and RoBERTa [28] models with a task-specific MLP to predict news veracity. Prompt-tuning methods include PET [40], which provides task descriptions to PLMs for supervised training via task-related cloze questions and verbalizers, and KPT [16], which expands the label word space with class-related tokens of varied granularities and perspectives. For a fair comparison, we do not implement self-training or PLM ensembling for PET. We utilize the base versions of BERT and RoBERTa for fine-tuning, and adopt BERT-base as the backbone of the prompt-tuning baselines, consistent with the setup of our proposed approach.
As P&A focuses on news content and social context, we do not compare P&A with knowledge-based approaches [7,15,17] that incorporate entity information from external knowledge bases, which are orthogonal to our contributions.

Implementation Details.
We implement P&A and its variants based on PyTorch 1.8.0 with CUDA 11.1. We utilize pre-trained BERT-base weights from HuggingFace Transformers 4.13.0 [53]. The maximum sequence length, batch size, and learning rate are set to 512, 16, and 5 × 10^−5 respectively, consistent with [5]. We fine-tune the models for 3 epochs for n ∈ {16, 32}, and 5 epochs for n ∈ {64, 128}. In the "Align" module, we set the pseudo labeling threshold θ_p at the 95-th percentile among all test data predictions, and set the user engagement threshold θ_e to 5. The number of alignment steps k is set to 2. For the baseline methods, we follow the architectures and hyperparameter values suggested by their respective authors. In all experiments, we report the average test accuracy (%) across 20 different runs of each method.

Few-Shot Detection Performance
Table 2 compares the performance of P&A with competitive baselines. We observe that: (1) Among "Train-from-Scratch" methods, social graph based methods (FANG and GCNFN) consistently outperform news content based methods, which indicates the importance of incorporating social context topology. (2) PLM-based methods outperform "Train-from-Scratch" methods in numerous cases, showing that the informative pre-trained features in PLMs help address the label scarcity issue. (3) Among PLM-based methods, prompting methods (PET and KPT) consistently outperform the standard fine-tuned PLMs, which supports our analysis that prompting directly elicits task-specific knowledge from PLMs. (4) P&A substantially outperforms the most competitive baseline: given n training samples (n ∈ {16, 32, 64, 128}), P&A enhances few-shot fake news detection accuracy by 8.93%, 11.01%, 6.36% and 5.74% on average, respectively. The statistically significant performance gains validate the effectiveness of eliciting task-specific knowledge with prompts and performing confidence-informed prediction alignment across the news proximity graph.

Ablation Study
To investigate the contribution of P&A components, we ablate two important components of our approach (thresholded pseudo labeling and social graph alignment) via the following variants:
• P&A(−TPL), which removes the thresholded pseudo labeling mechanism on high-confidence samples (see Eq. 8).
• P&A(−TPL − G), which eliminates the social context (i.e., our news proximity graph) and reduces to the "Prompt" component.
Note that when we only remove the graph-based social alignment steps (i.e., (−G)), the results are the same as (−TPL − G), as thresholded pseudo labeling does not invert any predicted labels.
Figure 4 shows the few-shot performance of the different P&A variants on three datasets. We find that: (1) Social alignment contributes significantly to overall performance, which indicates that aligning confidence-enhanced prompting predictions across our news proximity graph effectively utilizes the veracity-consistent signals in the social context. (2) Removing the thresholded pseudo labeling mechanism leads to inferior performance relative to full P&A, which is expected, as removing pseudo labels exacerbates label sparsity on the social graph. (3) P&A(−TPL − G) (i.e., our prompting module) consistently outperforms BERT-FT, which is consistent with our analysis that tuning the PLM via task-specific textual prompts helps directly elicit task-specific knowledge and alleviate label scarcity.

Parameter Sensitivity Analysis
We explore the sensitivity of three important hyperparameters in P&A: the pseudo labeling percentile θ_p for leveraging high-confidence predictions, the user engagement threshold θ_e for filtering inactive users, and the number of alignment steps k over the news proximity graph.

Effects of Threshold Values.
To evaluate the impact of the threshold values θ_p and θ_e, we investigate the performance of P&A with θ_p ∈ {80, 85, 90, 95} and θ_e ∈ {2, 3, 5, 10}, and present the results in Figure 5. We observe that: (1) The performance of P&A remains relatively smooth across different threshold values, with all combinations significantly outperforming the best baselines. Our default threshold values of θ_p = 95 and θ_e = 5 consistently give satisfactory performance, which indicates that our settings for the social alignment stage are practical for few-shot fake news detection on social media. (2) When we gradually lower the pseudo labeling threshold θ_p, the performance of P&A slightly degrades. This is because θ_p is formulated as a percentile, with which P&A automatically determines the numeric threshold based on the distribution of all prediction scores. Consequently, P&A assigns one-hot labels to more samples as θ_p decreases, and the inherent uncertainty of these additional samples leads to error accumulation.

Effects of Graph Neighborhood Sizes.
As validated by our ablation study in Section 5.3, the veracity-consistent social context (i.e., news proximity) effectively enhances the prompting predictions. To investigate the structural news readership patterns embedded in the news proximity graph, we evaluate P&A performance over varying numbers of alignment steps (denoted as k in Eq. 9), with k ∈ {1, 2, 3, 4}.
As shown in Table 3, we find that: (1) The performance of P&A remains relatively smooth across different values of k, with all settings significantly outperforming the best baselines. Our proposed P&A adopts k = 2 alignment steps across the news proximity graph, which consistently yields satisfactory results. (2) Beyond the 1-hop neighborhood (i.e., news articles with direct overlaps in readership), aggregating information from each article's 2-hop and 3-hop neighborhoods on the news proximity graph can boost the performance of P&A, which indicates the effectiveness of higher-order social context modeling. (3) Compared with incorporating 2-hop and 3-hop neighbors, incorporating 4-hop neighborhoods leads to slight performance degradation. Indeed, large neighborhood sizes might not be helpful for fake news detection, as they contain intermingled veracity signals from the "fake" and "real" classes, which obstructs prediction alignment.

Impact of Prompt Design
To evaluate the effectiveness of the prompt-based paradigm utilized in P&A, we explore the performance of P&A across varying prompt templates (denoted as P1-3, respectively). For a fair evaluation, the mapping from prompting output to class labels is kept consistent across P1-3 as "news" → 0 (real) and "rumor" → 1 (fake). Note that we report the results of P&A using P1 in all other subsections. As the number of annotated samples grows, the performance gap between templates is reduced, indicating that the need for a well-designed task-specific template is greatly eased with more annotated data.

Case Study
To further illustrate why our P&A paradigm outperforms the most competitive baselines, specifically regarding the effectiveness of prediction alignment over the news proximity graph, we conduct a case study involving a test sample from the PolitiFact dataset (Figure 6) that demonstrates P&A's capability of correcting base prediction errors. Given an evidently fake news article that claims "Donald Trump was pronounced dead this morning" (denoted as x_3 in the figure), the news content based "Prompt" component of P&A mistakenly assigned it a probability score of 0.596 for the "real" class, producing an incorrect base prediction. However, in the news proximity graph, x_3 is closely connected to multiple 1-hop and 2-hop neighbors (x_2, x_4, x_5, x_6) with predictions that favor the "fake" class at different confidence levels, in line with our hypothesis that similar readership implies similar news veracity. Within this neighborhood subgraph, the highest-weighted edge connects the misclassified x_3 with a ground-truth one-hot label at x_2. The edge weights of our news proximity graph are derived from the news articles' readership similarity; hence, P&A's prediction alignment across these edges facilitates prediction consistency in terms of veracity. In the first alignment step, which aggregates the prediction scores from each node's 1-hop neighbors, the high-confidence label knowledge from the ground-truth node x_2 effectively propagated to x_3, correcting x_3's prediction scores to 0.148 for the "real" class and 0.199 for "fake". Besides this, the signals from the other "fake" nodes (x_4, x_5, x_6) in the same subgraph are also reinforced via their respective 1-hop neighbors. Consider x_4, whose base prediction (i.e., class probabilities) favors "fake" over "real" by a mere 0.02. Nevertheless, in the first alignment step, x_4 benefited from the soft label knowledge of its higher-confidence neighbors x_5 and x_6, thereby gaining a more confident prediction of 0.151 for "real" and 0.245 for "fake". Similarly, in the second alignment step, the corrected "fake" prediction of x_3 remains unchanged, and x_3's veracity-related signals are further enhanced, leading to improved results at the subsequent final classification stage.
Our case study shows that despite the abundant pre-trained knowledge in PLMs, base prediction errors can still arise due to the emergent nature of news topics. This highlights the necessity of supplementing the base predictions from PLMs with veracity-indicative evidence from the social context. By aligning the prompting predictions on a veracity-consistent news proximity graph, P&A effectively enhances fake news detection performance under label scarcity, and offers potential explainability in terms of users' news consumption preferences.
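The alignment step discussed in the case study can be sketched as follows. This is a minimal interpretation assuming a row-normalized weighted adjacency matrix and simple neighbor averaging; the exact update rule in P&A may differ, and all numeric values below are purely illustrative.

```python
import numpy as np

def align_predictions(P, W, num_steps=2):
    """Propagate class-probability predictions along the news
    proximity graph: each node repeatedly aggregates a weighted
    average of its neighbors' predictions (hypothetical sketch).
    W is assumed symmetric and non-negative with zero diagonal."""
    # Row-normalize the edge weights so each node aggregates a
    # convex combination of its neighbors' predictions.
    deg = W.sum(axis=1, keepdims=True)
    A = np.divide(W, deg, out=np.zeros_like(W), where=deg > 0)
    for _ in range(num_steps):
        P = A @ P  # one alignment step over the graph edges
    return P

# Toy subgraph echoing the case study: node 0 is misclassified as
# "real", its strongest neighbor carries a ground-truth "fake" label.
P = np.array([[0.60, 0.40],   # columns: [P(real), P(fake)]
              [0.00, 1.00],   # ground-truth one-hot "fake" neighbor
              [0.49, 0.51]])
W = np.array([[0.0, 0.8, 0.2],
              [0.8, 0.0, 0.3],
              [0.2, 0.3, 0.0]])
aligned = align_predictions(P, W, num_steps=1)
print(aligned[0])  # node 0 now favors "fake"
```

After a single step, the high-confidence label of the ground-truth neighbor dominates node 0's aggregated prediction, mirroring how a₂ corrected a₃ in the case study.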

CONCLUSION AND FUTURE WORK
In this paper, we investigate the fake news detection problem under a practical few-shot setting. We introduce "Prompt-and-Align" (P&A), a novel prompt-based paradigm for few-shot fake news detection that jointly leverages pre-trained knowledge from PLMs and veracity-indicative news readership patterns. P&A alleviates the label scarcity bottleneck by directly eliciting task-specific knowledge from a PLM via textual prompts, and encodes the veracity-consistent user engagement patterns with a news proximity graph. These designs enable P&A to fully exploit the limited high-quality labels and predictions, specifically by aligning the predictions over the news proximity graph in a confidence-informed manner. Extensive experiments on three real-world benchmark datasets validate the effectiveness of P&A, with consistent performance gains across varying hyperparameter settings and prompts. Our work demonstrates promising potential for moving beyond the existing paradigms for fake news detection, suggesting more focused research on generalizable prompt-based learning frameworks and veracity-aware social context modeling schemes under the few-shot scenario.

Figure 3: Social users exhibit a clear preference for spreading either fake news or real news. The fake news affinity score is formulated in Section 4.2.
• Effectiveness (Section 5.2): How effective is P&A in few-shot fake news detection?
• Ablation Study (Section 5.3): How do news content and social graph structure contribute to the performance of P&A?
• Parameter Sensitivity Analysis (Section 5.4): How does P&A perform under different alignment steps, pseudo-labeling percentiles, and user engagement thresholds?

Figure 5: Parameter sensitivity analysis of P&A on GossipCop under different combinations of the pseudo-labeling percentile and the user engagement threshold. Lighter color represents higher accuracy (z-axis, %).

Figure 6: P&A corrects the base predictions via confidence-aware prediction alignment on the news proximity graph. This can be observed from the logits assigned to the class labels throughout different alignment steps. The node with a question mark represents a news article to be classified, presenting an empirical illustration of how P&A combines information from (1) prompt-based predictions, (2) confidence-informed label knowledge, and (3) social user engagements. For simplicity, we only visualize the highest-weighted edges for each node (the general illustration of P&A is presented in Figure 2).
of size n and a large unlabeled test set T_test. The articles in the training set are annotated with the corresponding one-hot labels Y_train = {y_1, y_2, ..., y_n}, where y_i ∈ R^2. Input: news dataset D = {T, U, R}, training labels Y_train; Output: predicted labels Y_test.

1. Prompt-Based Predictions
alignment over user engagements. News article nodes are connected by a news proximity graph (see Section 4.2), and darker node colors denote higher confidence.
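A readership-based news proximity graph of this kind could be constructed roughly as follows. The binary engagement matrix and the cosine-similarity weighting here are illustrative assumptions, not necessarily the paper's exact formulation from Section 4.2.

```python
import numpy as np

# Hypothetical binary engagement matrix: E[u, i] = 1 if user u
# engaged with (e.g., shared) news article i.
E = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)

R = E.T                                   # article-by-user readership vectors
norms = np.linalg.norm(R, axis=1, keepdims=True)
R_hat = np.divide(R, norms, out=np.zeros_like(R), where=norms > 0)
W = R_hat @ R_hat.T                       # cosine similarity of readerships
np.fill_diagonal(W, 0.0)                  # no self-loops
```

Articles with heavily overlapping readerships (here, articles 0 and 1) receive high-weight edges, while articles with disjoint audiences (articles 0 and 3) remain disconnected, which is the veracity-consistent signal the alignment step exploits.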
Incorporation of ground-truth knowledge. To fully exploit the high-quality annotated training samples, we stack the one-hot training labels Y_train into a ground-truth matrix, and use these ground-truth class probabilities to replace the rows of the base prediction matrix pertaining to the training data. The corresponding
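A minimal sketch of this ground-truth replacement step (array names and values are hypothetical):

```python
import numpy as np

# Base class-probability predictions from the prompt component,
# one row per news article: columns are [P(real), P(fake)].
P_base = np.array([[0.70, 0.30],
                   [0.40, 0.60],
                   [0.55, 0.45]])

train_idx = np.array([1])                 # indices of labeled training articles
Y_train = np.array([[0.0, 1.0]])          # stacked one-hot training labels

# Overwrite the training rows with ground truth, so the highest-
# confidence knowledge anchors the subsequent prediction alignment.
P_base[train_idx] = Y_train
```

Because the one-hot rows carry maximal confidence, they act as fixed anchors whose label knowledge propagates outward during the alignment steps over the news proximity graph.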

Table 3: Accuracy (%) of P&A over different numbers of alignment steps on the news proximity graph.

From Table 4, we observe that (1) P&A variants across multiple templates consistently outperform the most competitive baseline, which validates the effectiveness of prompting in eliciting pre-trained knowledge from PLMs for downstream tasks. (2) As the number of labeled samples increases, the fluctuation across different prompt