Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language

We introduce a multi-step reasoning framework using prompt-based LLMs to examine the relationship between social media language patterns and trends in national health outcomes. Grounded in fuzzy-trace theory, which emphasizes the importance of gists of causal coherence in effective health communication, we introduce Role-Based Incremental Coaching (RBIC), a prompt-based LLM framework, to identify gists at-scale. Using RBIC, we systematically extract gists from subreddit discussions opposing COVID-19 health measures (Study 1). We then track how these gists evolve across key events (Study 2) and assess their influence on online engagement (Study 3). Finally, we investigate how the volume of gists is associated with national health trends like vaccine uptake and hospitalizations (Study 4). Our work is the first to empirically link social media linguistic patterns to real-world public health trends, highlighting the potential of prompt-based LLMs in identifying critical online discussion patterns that can form the basis of public health communication strategies.


INTRODUCTION
During the COVID-19 pandemic, social media was at the center of proliferating mass antipathy and distrust towards government health policies and recommendations [26,55]. Millions took to social media to oppose federal and state health practices, criticize medical professionals, or organize anti-vaccine and anti-mask rallies [62]. The viral growth of such online conversations fueled animosity and extremist views that encouraged people to resist public health guidelines [2,26]. Disregarding public health practices, such as wearing masks, maintaining social distance, and getting vaccinated, resulted in significant societal costs. Between November and December of 2021 alone, over 692,000 preventable hospitalizations were reported among unvaccinated patients, leading to a staggering $13.8 billion in costs [31]. Soaring COVID-19 infection cases put a massive burden on healthcare systems, depleting medical resources and contributing to severe burnout and shortages among healthcare workers [53]. Meanwhile, COVID-19 conspiracies and hyper-partisan news on social media led to nationwide protests, obstruction of medical facilities [6], and even fatal assaults on employees requesting customers to wear masks [8].
According to fuzzy-trace theory (FTT), texts that clearly establish cause-and-effect relationships facilitate humans' extraction of gist mental representations, helping people understand and remember information better than texts without any causal coherence [83,87]. This aligns with previous studies in decision sciences, which have shown that the causal coherence of gists in texts plays a crucial role in how individuals perceive risks and make health-related decisions [24,88]. Throughout the pandemic, social media conversations refuting COVID-19 public health practices based on mis-/disinformation and identity politics continued to obscure people's knowledge of safe health practices, making well-informed health decisions extremely difficult [113]. Using evidence-based theories like FTT allows us to create psychologically descriptive models that transform language into analyzable units shown to predict human behavior [24,85].
In this paper, we leverage the capabilities of prompt-based Large Language Models (LLMs) to delve into the nuanced language patterns in social media discussions opposing COVID-19 public health practices through a theory-driven approach grounded in fuzzy-trace theory. By leveraging prompt-based LLMs, we dissect the language around resistance to COVID-19 health practices through the lens of FTT and its central concept of gist. Specifically, we examine how causal language patterns, or gists, manifest across social media communities that denounce pandemic health practices, contribute to trends in people's health decisions, and, by extension, impact national health outcomes. We divide our work into four main studies to address the following research questions:
• RQ1. How can we efficiently predict gists across social media discourse at-scale? (Study 1)
• RQ2. What kind of gists characterize how and why people oppose COVID-19 public health practices? How do these gists evolve over time across key events? (Study 2)
• RQ3. Do gist patterns significantly predict patterns in online engagement across users in banned subreddits that oppose COVID-19 health practices? (Study 3)
• RQ4. Do gist patterns significantly predict trends in national health outcomes? (Study 4)
We answer RQ1 by leveraging LLMs and their prompt-based capabilities to identify gists in social media conversations at-scale (Study 1). We do so by developing a novel prompting framework that detects and extracts cause-effect pairs in sentences from a corpus of online discussions collected from banned Reddit communities known for opposing public COVID-19 health practices. Study 1 allows us to identify the causal language (cause-effect pairs that form gists) that underlies how people argue against COVID-19 health practices on social media. We answer RQ2 by clustering sentence embeddings of gists (sentences with causal relations identified in Study 1) to identify the most salient gist clusters, and demonstrate how they evolve across key
events (Study 2). Finally, we answer RQs 3 and 4 by using Granger causality to test whether causal discourse (gists) on social media can significantly predict online engagement patterns (Study 3), and trends in national health decisions and outcomes in the U.S. (Study 4).
Contributions: This work's intellectual merits are methodological and theoretical. The computational techniques introduced in this work enable efficient and scaled prediction of gists on social media, and thus can be used to better identify and understand the underlying mental representations that motivate health decisions and attitudes towards public health practices (Study 1). The clustering and evolution of gists in Study 2 identify the most salient themes associated with how people causally argue against pandemic health practices online. Patterns in gist volumes across cluster topics fluctuate closely with topically-related high-profile events, including federal health announcements, congressional policies, and remarks by a country's leader. Study 3 empirically confirms that gist volumes significantly drive subreddit engagement patterns (upvotes and comments), with implications for how causal language may play a role in monitoring conversations in content-moderation practices of controversial online health communities. Finally, gist patterns within subreddits that support anti-pandemic health practices were significantly interrelated with nationwide trends in important health decisions and outcomes (Study 4). To the best of our knowledge, our research is the first to empirically establish Granger causality between linguistic patterns in social media discussions about COVID-19 health measures and real-world trends in public health outcomes. Our work entails the following contributions:
• The task of accurately predicting causal language patterns and generating coherent gists (causal statements) is a complex challenge [65,86]. We overcome this by introducing a multi-step prompting framework: Role-Based Incremental Coaching (RBIC). RBIC is a prompting mechanism that allows efficient prediction of gists across social media conversations at-scale. RBIC integrates role-based cognition with effective learning in sub-tasks to enhance the model's overall understanding of a given task prior to generating a final output. By leveraging RBIC, we overcome prior challenges in detecting subtle and complex expressions of semantic causality in noisy text. In doing so, this work advances state-of-the-art approaches in detecting gists at-scale, yielding a novel, psychologically relevant, and efficient technique for identifying and examining bottom-line meanings in massive amounts of textual data.
• We demonstrate the novel application of prompt-based LLMs in advancing computational social science (CSS) methods in Human-Computer Interaction (HCI) research. Generic Natural Language Processing (NLP) models and LLMs typically lack multi-step reasoning capabilities [116]. This limitation makes it difficult to apply such models in performing nuanced and complex text analyses in CSS research [123]. By applying RBIC, we overcome this limitation and demonstrate the versatility and effectiveness of prompt-based LLMs in identifying and synthesizing nuanced linguistic patterns. In so doing, we contribute to broadening the potential application of prompt-based LLMs for theory-driven textual analysis in CSS research in the HCI domain.
• Our research enhances the analytical depth and scope of insights into the causal discourse surrounding people's opposition to public health practices on social media. We identify the most salient gist clusters that embody the core topics at the center of how and why people oppose public health practices throughout COVID-19, from May 2020 to October 2021. We use sentence embeddings and clustering to characterize how the volume of gists across each topic fluctuates in relation to key events associated with the core topics embodied by the gist clusters. By doing so, we capture how causal online discourse surrounding anti-COVID-19 health practices evolves over time across real-world events. Such insights can, in turn, inform timely public health communication strategies and interventions that account for ongoing current events [85].
• Finally, we address the question of whether and how social media language patterns in the form of gists influence nationwide trends in vaccinations, COVID-19 cases, and hospitalizations in the U.S., providing new evidence on how important health decisions and national health outcomes are impacted by causal linguistic signatures across social media health discussions, an important link that has not been empirically established at-scale in prior research.

RELATED WORK

2.1 Understanding the Impact of Social Media Language Patterns on Health Decisions and Outcomes
The COVID-19 pandemic has ignited an unprecedented increase in social media discourse on health decisions and practices [113,122], spurring a wave of computational social science research [39,104] aimed at understanding this phenomenon in the fields of HCI [63,77] and CSCW [16]. Using text mining and computational linguistics, researchers have analyzed pandemic-related social media discourse through the lens of mental health [75], political views [17,90], attitudes towards vaccines [79,119], misinformation [49,73,99], and perceptions of health policies and government institutions [41]. Such studies have uncovered key insights on how language patterns reflect people's beliefs [109], sentiments [54], and emotional wellbeing [10,118] during COVID-19. For example, researchers have examined collective shifts in the public mood in response to the evolving pandemic news cycles by analyzing the daily sentiment of tweets [105]. Similarly, others have analyzed social media posts containing a subset of depression-indicative n-grams to track the fluctuation in mental health of social media users over the course of the pandemic [39].
While such studies have made valuable contributions to understanding the role of language patterns in health-related discourse on social media [9,30], there remains an opportunity to explore their impact on real-world health decisions and outcomes. To the best of our knowledge, there has been a lack of research that examines how social media discussion patterns surrounding health practices can predict patterns in health decisions and outcomes in the real world. Our research aims to fill this gap. Some emerging research, such as the study by Nyawa et al. (2022), has started to explore this link by applying computational linguistics to categorize individuals as either vaccine-accepting or vaccine-hesitant based on their online language patterns [71]. Yet, the majority of empirical studies examining the impact of social media discourse on real-world behavior thus far have leaned heavily on survey-based methods [78,118]. These surveys often depend on self-reported metrics about social media use and health behaviors, thereby offering only a limited perspective on the complex relationship between social media discourse patterns and actual health decisions. This limitation underscores the existing challenges in understanding how health-related discussions on the internet translate into or shape real-world outcomes and decisions [7]. Our research aims to address this challenge by investigating how language patterns in social media conversations can serve as predictive markers for understanding real-world trends in people's health decisions and outcomes during the pandemic.

2.2 Understanding Health Discourse Through the Lens of Fuzzy-Trace Theory and Its Core Concept of Gist
Scholars have used fuzzy-trace theory (FTT) as a theoretical lens to explore risk perceptions and decisions underlying health practices and discussions in various contexts, including vaccines [113], cancer [115], HIV-AIDS [114], and the prescription of antibiotics [52]. These studies support FTT's core tenet that gists are stronger and more effective forms of communication than verbatim representations in the sense that they are (a) better remembered and (b) more likely to influence decisions [83,87]. For example, a study comparing articles on vaccines posted on Facebook showed that those containing gists (e.g., bottom-line meaning) are shared 2.4 times more often on average than articles with verbatim details (e.g., statistics) [15]. Having a story or images did not add unique variance to predictions once gist was accounted for. The study's results show that communications about vaccines are more widespread when they express a clear gist explaining the bottom-line meaning of the statistics rather than just the data themselves. Likewise, scholars have also used FTT as a theoretical framework to examine people's behavior across diverse contexts, such as law, medicine, public health, systems engineering, and HCI [61,88,124]. For example, in HCI, researchers have used FTT to examine people's behavior in online social tagging [93] and to improve speech-to-text interface design through gist-based communication [65]. Others have used FTT in designing a web-based intelligent tutoring system for communicating the genetic risk of breast cancer through gists [115]. Overall, FTT's theoretical breadth and empirical support as a cognitive explanation of how people process and communicate information related to health decisions makes it a well-suited theoretical lens for examining resistance towards public health practices in our research. Further, gists that causally link some event, actor, or outcome tend to facilitate more effective uptake of information than those that are less causally coherent [57,85]. In fact, causal coherence is one of the most important semantic aspects of gists that make gist-based communications effective [40]. For example, in a study analyzing 9,845 vaccine-related tweets, researchers discovered that tweets containing explicitly causal gists (e.g., "vaccines cause autism") were far more likely to be retweeted and to go viral. This was in contrast to tweets that suggested a link between vaccines and autism but emphasized details and lacked a meaningful causal connection [15]. Simply, information with stronger causal structure produces more meaningful gists in people, who then are more likely to remember, apply, and share that information [86]. Fuzzy-trace theory draws on psycholinguistic research on mental representations of narratives that underlies both human memory models and computational models in which causal connections are a central feature of common gists [89,103]. Hence, we focus on causal gists, or gists that contain a cause-effect relation. From here on, we refer to causal gists as gists.

2.3 Challenges in Predicting Semantic Causality in Online Health Discourse
Extracting cause-effect relations in text is one of the many open challenges in NLP research that has seen significant breakthroughs in recent years through the development of generative Large Language Models [120]. However, computational social science research has yet to take advantage of these advancements [123], particularly in examining gists related to health practices. For example, scholars have used topic modeling, such as Latent Dirichlet Allocation (LDA) [11], to identify gists in vaccine hesitancy [40]. While useful, these methods do not enable granular detection of gists at the sentence or phrase level. For instance, LDA only allows the detection of gists at the corpus level, where each identified topic across the entire dataset is treated as a proxy identification of one gist. Recent scholarship in medical informatics has examined health-related attitudes in social media by extracting causality through machine learning approaches with rule-based dependency parsing and named entity recognition [19,29,67,68,80]. While such approaches are an improvement, they can only detect intra-sentential (within a single sentence) and not inter-sentential causality, where cause and effect lie in different sentences (e.g., "God made us to breathe naturally. I won't be forced to wear masks."). More recently, transformer models such as InferBERT and CausalBERT, specifically designed for extracting causal relationships, have yielded more promising results [50,111]. However, the token limit of these models significantly reduces performance when dealing with longer texts [4]. Additionally, like humans, these models struggle to discern subtle forms of semantic causality in noisy or incoherent data. Our research aims to not only identify causality in text, but also generate coherent gists based on the identified cause-effect pairs. To achieve this, we address prior limitations by leveraging recent advancements in pretrained LLMs and their prompt-based approaches to develop a novel prompting framework to systematically predict gists [112].

STUDY 1: PREDICTING GISTS IN SOCIAL MEDIA CONVERSATIONS AT-SCALE
As a first step to analyzing how causal language patterns on social media impact health decisions and outcomes, we leverage the power of prompt-based LLMs in Study 1. Specifically, we develop and apply a multi-step prompting framework called Role-Based Incremental Coaching (RBIC) to efficiently predict gists across social media discourse at-scale. Role-Based Incremental Coaching is a prompting framework (Fig. 2) built with few-shot demonstrations using GPT-4, which consists of two primary prompting techniques: Role-Based Knowledge Generation and Incremental Coaching. Combined, RBIC allows the model to (1) learn its role for a given task by generating role-specific knowledge as a task-performing agent and (2) perform a series of small sub-tasks to refine its understanding and the quality of the final output by incrementally building upon the sub-task responses. RBIC allows us to systematically identify the presence of semantic causality in a given post and generate causally coherent gists across large volumes of textual corpora at-scale.

Method: Role-Based Incremental Coaching (RBIC)
Role-Based Knowledge Generation. Drawing inspiration from prior NLP research that leverages multi-step reasoning capabilities in LLMs [58], we developed Role-Based Knowledge Generation as the initial grounding component of our prompting framework. Asking an LLM to generate potentially useful information about a given task before producing its final response improves that final response [58]. For example, as shown in the open online course "Learn Prompting" (https://learnprompting.org/), when prompted with "Which country is larger, Congo or South Africa?", GPT-3 answers incorrectly. However, when the model is first prompted to "Generate some knowledge about the sizes of South Africa and Congo" before answering the final question, the model uses the output of the intermediate prompt ("South Africa [has] an area of...") to generate the correct answer: Congo is larger than South Africa. We leverage this prompting intuition in Role-Based Knowledge Generation to enhance the model's understanding of its role as a task-performing agent. By doing so, the model can achieve better performance by accessing potentially relevant contextual information, as shown in prompts P1 and P2 (Fig. 2). The corresponding outputs to P1 and P2 (O1 and O2, respectively) are then integrated with a task-specific prompt (P3) in the following step. The role-based knowledge outputs (O1 and O2) allow the model to perform tasks more accurately given its enhanced understanding of its specific role for achieving the task.

[Fig. 2 illustrates the prompt chain with an example. Role-based knowledge outputs: O1 "Certainly! A cause-effect relationship is a relationship between two events or variables…"; O2 "Of course. The term 'causal gist' refers to the fundamental meaning or essence of a sentence or text that expresses a causal relationship…". Sub-task prompts and outputs: P3 "Is there a cause-effect relationship in this given sentence? If yes, just answer 'Yes'; if no, just answer 'No'; don't give me any explanations" → O3 "Yes"; P4A "Then extract the corresponding cause phrase and effect phrase in the given sentence" → O4 {"Cause": "took the vaccine", "Effect": "really sick now"}; P4B "Please explain why"; P5 "Generate a reasonable and clear causal gist based on {"Cause": "took the vaccine", "Effect": "really sick now"} and your understanding of the sentence with the cause-effect relationship" → O5 "Taking the vaccine yesterday caused the person to become sick."]

Incremental Coaching. Inspired by Chain of Thought (CoT) [112], Incremental Coaching is a technique within the Role-Based Incremental Coaching (RBIC) framework that involves breaking down a complex task into smaller, manageable sub-tasks, as shown in P3-P5 in Fig. 2. The role-based agent is coached through a series of sub-tasks in a step-by-step manner, with each sub-task building upon the previous one. To implement Incremental Coaching effectively within RBIC, it is necessary to follow a logical sequence of sub-task prompts that allows the model to build understanding and confidence in performing the final task by generating incremental outputs (O3-O4). By breaking down the final task into a series of incremental sub-tasks, the role-based agent can gradually improve its comprehension of the final task to deliver a more accurate final response.
Application of RBIC. Here, we demonstrate the algorithmic conceptualization of the RBIC prompting framework in the context of generating gists. The essence of the Role-Based Incremental Coaching (RBIC) framework lies in its two core algorithmic components: Role-Based Knowledge Generation and Incremental Coaching, as shown in Algorithm 1. The RBIC algorithm requires the following inputs:
• User Input (U): RBIC is initialized by the user input. In our study, we operationalized the user input as U = (P1, P2, P3, P4A or P4B, P5), as shown in Fig. 2.
• Role-Based Agent (RBA): Essentially, this can be any prompt-based LLM. For our study, we used GPT-4 as our Role-Based Agent.
The RBIC algorithm then generates the following outputs:
• Knowledge Base (KB): The first phase of the RBIC algorithm, denoted as Role-Based Knowledge Generation, is symbolized by the function RBA.GenerateKnowledge(U). In this step, the Role-Based Agent (in our case GPT-4, though any prompt-based LLM can be substituted) is prompted with the user input U to elicit relevant background knowledge K. This knowledge forms the basis for task execution and is stored in an initial Knowledge Base via the assignment KB ← K, creating a dynamic knowledge architecture that adapts over time. In our study, K comprised O1 and O2 (as shown in the upper right of Fig. 2), which collectively formed our Knowledge Base.
• Final Task Output (F): The subsequent phase, known as Incremental Coaching, is predicated on a sequence of sub-tasks {T1, T2, ..., Tn} and their corresponding outputs {O1, O2, ..., On}. In this phase, each sub-task Ti leverages the updated Knowledge Base to produce an output Oi, which is then used to update the KB, iteratively coaching the model through the sub-tasks in a step-by-step manner. Breaking down the final Complex Task T into simpler sub-tasks Ti allows the model to incrementally build up the necessary knowledge and skills to tackle the final task, computed as F ← RBA.FinalOutput(T, KB). This incremental knowledge building across sub-tasks thus enables the model to better understand and perform the final Complex Task T. In our case, the Complex Task (T) generates a "gist" based on the cause-effect pairs. The individual sub-tasks that contribute to this complex task are labeled P3, P4A, P4B, and P5 (Fig. 2). The algorithm proceeds sequentially, producing intermediate outputs O3 and O4, and ultimately culminating in O5, the gist generated from the cause-effect pairs identified in sub-task P4A.
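To make the two phases concrete, here is a minimal runnable sketch of the RBIC loop described in Algorithm 1, with a stubbed agent standing in for GPT-4. The function names and the canned reply format are ours for illustration, not the paper's implementation.

```python
# Sketch of RBIC: Role-Based Knowledge Generation builds a knowledge base KB,
# then Incremental Coaching feeds each sub-task (plus the accumulated KB)
# back through the agent before the final complex task runs.
def rbic(agent, knowledge_prompts, sub_tasks, final_task):
    # Phase 1: Role-Based Knowledge Generation -- elicit background knowledge K
    # and store it in the Knowledge Base (KB <- K).
    kb = [agent(p, context=[]) for p in knowledge_prompts]
    # Phase 2: Incremental Coaching -- each sub-task output is folded back
    # into KB before the next sub-task runs.
    for task in sub_tasks:
        kb.append(agent(task, context=kb))
    # Final output F conditions on the full accumulated Knowledge Base.
    return agent(final_task, context=kb)

def stub_agent(prompt, context):
    # Stand-in for a prompt-based LLM call; a real agent would send the
    # prompt plus context to an LLM API and return its completion.
    return f"response to {prompt!r} given {len(context)} prior turns"
```

In the study's instantiation, `knowledge_prompts` would be (P1, P2), `sub_tasks` the chain P3 and P4A (or P4B), and `final_task` the gist-generation prompt P5.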
When applied to predicting gists in social media conversations, RBIC instructs the model to understand the concept of cause-effect relations as a task-performing agent.The model then incrementally performs sub-tasks to recognize and extract cause-effect pairs, and finally generates a concise gist that captures the essence of the identified causal relationship.

Human Evaluation.
To assess the effectiveness of RBIC's application in predicting gists in our data, we conducted a human evaluation of the RBIC-generated outputs. We recruited 6 human evaluators to evaluate the presence of causal coherence (O3), cause-effect pairs (O4), and gists (O5) for each Reddit post based on the following criteria:
• Accuracy (classification): Is there a cause-effect relationship in the post (1/0; Yes/No)?
• Relevance (extraction): How well does the cause-effect pair capture the primary causal relationship in the post (1-5; not well at all, slightly well, moderately well, very well, extremely well)?
• Conciseness (generation): How well does the gist concisely summarize the cause-effect relationship in the post (1-5; not well at all, slightly well, moderately well, very well, extremely well)?
To mitigate error propagation, the evaluation was designed as a sequential process with checks for accuracy at each stage. First, evaluators focused on 'Accuracy', verifying the presence of a cause-effect relationship. Second, 'Relevance' was examined to ensure the identified cause-effect pairs accurately reflected the post's main causal relationship. Third and finally, 'Conciseness' was evaluated only for posts that had already met the 'Accuracy' and 'Relevance' criteria. This approach minimized the propagation of errors from earlier stages.
The accuracy criterion assesses the model's performance in identifying the presence of a causal relationship in a post. Relevance evaluates the model's ability to correctly extract the cause and effect phrases that are most salient to the core message of the post's content. Conciseness assesses the model's generative performance in concisely synthesizing a coherent gist based on the identified cause and effect phrases. In total, each of the 6 annotators evaluated 3,100 posts that were randomly selected from the entire dataset. For each criterion, each post received three evaluation scores from three annotators. The evaluators' assessments of the model's performance across the three criteria were generally high, with strong inter-rater agreement based on Fleiss' kappa (κ) [32]: accuracy (κ = 0.892); relevance (mean = 4.3, κ = 0.839); conciseness (mean = 4.5, κ = 0.864).
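The agreement statistic reported above can be computed directly from the raw rating counts. Below is a compact implementation of the standard Fleiss' kappa formula; the toy count matrix is ours for illustration, not the study's data.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for an (N items x k categories) matrix of rating counts.

    Each row sums to n, the number of raters per item (3 in our evaluation).
    """
    counts = np.asarray(counts, dtype=float)
    N, k = counts.shape
    n = counts[0].sum()                                   # raters per item
    p_j = counts.sum(axis=0) / (N * n)                    # category proportions
    P_i = (np.square(counts).sum(axis=1) - n) / (n * (n - 1))  # per-item agreement
    P_bar, P_e = P_i.mean(), np.square(p_j).sum()         # observed vs. chance
    return (P_bar - P_e) / (1 - P_e)
```

For example, four posts each rated Yes/No by three raters with unanimous agreement yield κ = 1.0, while partial disagreement pulls κ toward 0.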

Result
Table 1 presents the results of RBIC's application, demonstrating the effectiveness of our prompting framework in predicting gists at-scale. We identified a total of 6,861 gists in our data. As shown, RBIC can not only detect semantic causality (O3), but also extract verbatim phrases corresponding to the main cause-effect pairs (O4) and generate coherent gists (O5) based on the identified pairs. In the first example, RBIC detects sentences where causality is implied with nuance, as well as those where it is more explicitly stated.
Although most of the gists accurately capture the semantic essence of the causal relationship, some are more eloquent than others. For instance, the gists in examples 2 and 4 use sentence inversions, beginning with "the cause of", while others are more semantically fluid. We also performed a comparison using fine-tuned language models (BERT, RoBERTa, and XLNet), as detailed in the appendix (Table 6), which showed that RBIC outperformed the baseline models in extracting cause-effect pairs (O4) by 26.6% in F1-score when comparing RBIC to the best-performing baseline model (RoBERTa, with a 0.814 F1-score).
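For reference, an F1 comparison on extracted cause-effect pairs scores predictions against gold annotations. The sketch below shows one common way to compute a micro-averaged F1 with exact-match pairs; this is our simplification for illustration, not necessarily the paper's scoring protocol.

```python
def span_f1(predicted, gold):
    """Micro-averaged F1 over extracted cause-effect pairs.

    `predicted` and `gold` are parallel lists: one set of
    (cause, effect) tuples per post, matched exactly.
    """
    tp = sum(len(p & g) for p, g in zip(predicted, gold))  # true positives
    fp = sum(len(p - g) for p, g in zip(predicted, gold))  # spurious pairs
    fn = sum(len(g - p) for p, g in zip(predicted, gold))  # missed pairs
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In practice, span-extraction evaluations often relax exact match to token overlap, which changes the counts but not the F1 formula.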

STUDY 2: HOW GISTS EVOLVE OVER TIME
Given the rapidly evolving public health discussions on social media, it is crucial to examine how they evolve over time [28,38].
This enables a better understanding of shifts in public opinion and emerging concerns across contentious debates around health practices like vaccinations, mask-wearing, and social-distancing [33]. Hence, in Study 2, we build upon our Study 1 findings to address: What kind of gists characterize how and why people oppose COVID-19 public health practices? How do these gists evolve over time across key events? To answer these questions, we extract sentence embeddings from each of the gists identified in Study 1, and cluster the embeddings to identify distinct gist clusters that characterize the core topics at the center of how people argue against COVID-19 health practices.

Method
4.1.1 Extracting Sentence Embeddings from Gists. To identify the most salient topics across the causal language (gists) surrounding the social media discourse against public health practices, we use Sentence-BERT (S-BERT) to extract semantically rich representations of the gists identified in Study 1. S-BERT is a transformer-based model designed to produce contextualized sentence embeddings, which are particularly valuable in clustering texts [82,101].
After preprocessing the gists with standard text cleaning operations (lowercasing, removal of special characters, tokenization), we implemented S-BERT using the SentenceTransformer library to extract embeddings from our gists. The S-BERT model comprises 12 hidden layers, with each layer producing an output representation of N × 768 dimensions (one 768-dimensional vector per token). To obtain high-quality embeddings, we extracted the output representations from each of the last three hidden layers of the model (layers 10-12) and computed their means. By doing so, we capture a semantically rich representation of each gist as a high-dimensional vector.
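The layer-averaging step can be sketched as follows, with random arrays standing in for the encoder's hidden states. The shapes follow a standard 12-layer BERT-style model (13 hidden-state tensors including the embedding layer), and the final token-level mean pooling is our assumption of how a single per-gist vector is obtained.

```python
import numpy as np

# Stand-in for a BERT-style encoder's hidden states: 13 tensors
# (embedding layer + 12 transformer layers), each (seq_len, 768).
rng = np.random.default_rng(0)
hidden_states = [rng.normal(size=(24, 768)) for _ in range(13)]

# Average the last three transformer layers (layers 10-12)...
last_three = np.stack(hidden_states[-3:])    # (3, seq_len, 768)
layer_avg = last_three.mean(axis=0)          # (seq_len, 768)

# ...then mean-pool over tokens to get one 768-dim vector per gist.
sentence_embedding = layer_avg.mean(axis=0)  # (768,)
```

With a real model, `hidden_states` would come from an encoder invoked with hidden-state output enabled; only the averaging arithmetic is shown here.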
4.1.2 Clustering of Sentence Embeddings. After obtaining the sentence embeddings, we applied Principal Component Analysis (PCA) [13] to reduce the dimensionality of the embeddings prior to the clustering step. This was done to better visualize the language embeddings in a lower-dimensional space and to facilitate a more effective interpretation of the embedding results. We selected PCA as our method given its frequent use and proven effectiveness in reducing dimensionality, especially for language embeddings [97]. We used k-means [69] for clustering, as it is especially reliable for clustering semantic word representations [121]. The k-means algorithm iteratively assigns each embedding to the cluster with the closest centroid, and updates each centroid by calculating the mean of the embeddings assigned to that cluster [121]. This process continues until the centroids stabilize. To enhance the reliability and robustness of our clustering approach, we incorporated sentence embeddings of posts that did not contain any gists, following Samosir's study [92]. This step allows us to assess the quality of the sentence embeddings by verifying that embeddings from sentences without gists cluster apart from embeddings derived from gists. Finally, we used the elbow method [70] to determine the optimal number of clusters by calculating the sum of squared errors (SSE) in ascending order of cluster numbers until additional clusters resulted in diminishing returns [64].
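A condensed sketch of the PCA + k-means + elbow pipeline, run on synthetic stand-in embeddings (scikit-learn assumed; the blob structure, cluster counts, and dimensions here are illustrative, not the study's).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Synthetic stand-in for 768-dim gist embeddings: three well-separated blobs.
rng = np.random.default_rng(42)
centers = rng.normal(scale=10.0, size=(3, 768))
X = np.vstack([c + rng.normal(size=(60, 768)) for c in centers])

# Reduce dimensionality with PCA before clustering.
X_low = PCA(n_components=2, random_state=0).fit_transform(X)

# Elbow method: track the sum of squared errors (inertia) as k grows;
# the "elbow" is where additional clusters yield diminishing returns.
sse = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_low).inertia_
       for k in range(1, 7)}
```

On this synthetic data the SSE drops sharply up to k = 3 (the true number of blobs) and nearly flattens afterwards, which is the pattern the elbow method looks for.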
4.1.3 Verifying Gist Clusters. The first author initially identified the primary themes of each cluster by categorizing, screening, and summarizing 200 randomly selected gists from each cluster. Next, we recruited six annotators to manually evaluate and verify the five primary gist clusters shown in Table 3. Annotators evaluated the clustering results by iteratively examining and discussing the themes across 200 randomly selected gists from each cluster, providing a binary (1/0) judgment per gist. See annotation agreement in Table 2.
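Agreement of the kind reported in Table 2 (and the Fleiss κ cited in the results) can be computed with statsmodels' Fleiss κ implementation. The ratings below are randomly generated stand-ins, not our annotation data; the 10 × 6 shape mirrors the binary (1/0) judgments from six annotators.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical ratings: 10 gists x 6 annotators, binary label
# (1 = the gist fits the cluster theme, 0 = it does not).
rng = np.random.default_rng(1)
ratings = rng.integers(0, 2, size=(10, 6))

# aggregate_raters converts per-rater labels into the subjects x categories
# count table that fleiss_kappa expects as input.
table, _ = aggregate_raters(ratings)
kappa = fleiss_kappa(table)
print(f"Fleiss kappa = {kappa:.3f}")
```

Random ratings yield κ near zero; perfectly consistent annotators yield κ = 1.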
The verification process included two additional steps: (1) refining cluster descriptions so that they were thematically salient and representative of the core ideas and topics embodied by the gists in each cluster, and (2) examining the sentences in the non-gist cluster (C6) to confirm that they did not include any gists.

Results
A representative sample of gists from each cluster is presented in Table 3, illustrating the core topics that characterize the opposition discourse surrounding pandemic health practices. In Fig. 3 (top panel), we visualize the evolution of our gist clusters across four time points (May 2020 - October 2021).
The visualization reveals interesting relations between the clusters. For instance, cluster 4, which embodies gists discussing the impact of COVID-19 on the economy and society at large, wraps around cluster 3's gists related to the impact of lockdown policies. This spatial proximity suggests that causal discussions on the broader consequences of the pandemic are closely intertwined with gist-based conversations on the impact of lockdown measures, shedding light on the interconnected, causal manner in which people talk about these two topics. Similarly, clusters 1 and 2 are not only close in proximity but also similar in position and shape: both are diagonally oriented from top left to bottom right and run parallel to each other. Given that both clusters represent gists concerning specific health practices (vaccination, mask-wearing), these topics likely share similarities in the causal manner in which people discuss the effectiveness of such practices. Cluster 5, which represents gists related to conspiracy theories, domestic politics, and foreign countries, lacks a clear boundary and is spatially dispersed compared to the other clusters. This may be because cluster 5 encompasses multiple topics, as indicated by its description, in contrast to the other clusters, which are more uniformly focused on specific health practices, government measures, or particular aspects of the pandemic's impact. The lower inter-rater agreement for cluster 5 (Fleiss κ = 0.821) further supports the notion that it is a heterogeneous cluster spanning various topics.
The bottom half of Fig. 3 demonstrates that peak volumes of gists within each cluster align closely with key events related to the topics embodied by those clusters. To identify these key events, we relied on reports from major health organizations, including the Centers for Disease Control and Prevention (CDC), World Health Organization (WHO), and United Nations (UN), for announcements related to public health interventions like lockdowns and vaccine rollouts [48,107]. Reports from these organizations were widely recognized as authoritative information sources across the global community. Hence, we used announcements and reports from these sources that highlighted key pandemic events, public health announcements, and significant milestones across the COVID-19 timeline.
In addition to these organizational reports, we also analyzed COVID-19 articles published by major news outlets such as AP News, Reuters, CNN, Fox News, the Wall Street Journal, the New York Times, and NPR. We then identified highly mediatized events using the number of shares and article comments. This process also entailed iterative discussions among all authors to ensure a comprehensive and balanced selection of events. Our approach aimed to minimize bias by incorporating a diverse range of sources and validating the significance of events through multiple indicators, such as media coverage intensity and public engagement.
For example, cluster 1's peak occurs in November 2020, coinciding with the country's initial phase of vaccine distribution to healthcare workers and high-risk groups [48]. Similarly, cluster 2 gists (mask-wearing) peak in June 2021, the same month in which the federal mask mandate was lifted [107]. Cluster 3 gists, which relate to the impact of lockdowns, peak in May 2020, by which point approximately 4.2 billion people, or 54% of the world's population, were under lockdown [42]. In December 2020, the U.S. Congress passed a bill to distribute $90 billion in stimulus checks to households, as nearly 30 million American adults reported food and income insecurity that same month [106]. These events temporally coincide with the peak in cluster 4 gists, which concern the socioeconomic consequences of the pandemic. Finally, cluster 5 gists, which are related to conspiracy theories, politics, and foreign countries, reached their peak volume in August 2020, around the time when President Trump retweeted a popular online conspiracy theory [100] and referred to the "China virus" in his White House briefing [12]. Our findings imply that trends in gist volumes are linked with real-world events.

Table 3: Representative gists from each cluster.

C1. Vaccine Regulations, Efficacy, and Side Effects
• The implementation of a vaccine mandate has resulted in people losing their jobs.
• The use of experimental COVID vaccines is causing an increase in COVID deaths.
• The vaccine was ineffective against new variants, which led to the death of 7,000 people who received the spike protein mRNA jab, including little kids. This suggests that the vaccine was administered for no reason, as it failed to provide protection against the new variants.

C2. Controversies Related to Mask-Wearing Practices
• If a person refuses to wear a mask at a business for medical reasons, the business may deny them services.
• The lifting of mask mandates for vaccinated individuals has caused the proliferation of a deadly biohazard, which could lead to the CDC and other agencies being charged with involuntary manslaughter.
• Wearing masks prevents people from seeing each other's faces, which leads to difficulties in understanding and building trust with others.

C3. Impact of Lockdown
• The lockdowns have caused tourism-dependent islands in Thailand to suffer from a lack of income, leading to a situation where they have been on food aid for over a year.
• The lockdowns caused a loved one to almost commit suicide, highlighting the negative impact of lockdowns on mental health.
• The prolonged lockdown imposed by Cuomo for six months has resulted in the inability of the speaker to pay their bills.

C4. Societal and Economic (Macro) Impact of COVID-19
• The outbreak of COVID-19 has caused people to struggle with their livelihood, leading to financial difficulties and economic instability.
• The COVID-19 pandemic has caused the biggest drop in US life expectancy since the second world war.
• The COVID-19 shutdowns have resulted in 1 in 5 churches facing permanent closure within 18 months due to the financial strain caused by the pandemic.

C5. Conspiracy Theories, Domestic Politics, Foreign Countries
• People refuse to share a table or work with certain people because they see "certain people" as sub-human because of their vaccination status.
• The sentence suggests that if COVID-19 was intentionally released, it would lead to a major benefit for China and billionaires. The implication is that the cause of COVID-19's intentional release would be to bring about this benefit for these parties.
• The lack of information on the epidemic from people on whether they think something is safe or not is preventing the speaker from being able to debate with their conspiracy theory friends.

STUDY 3: HOW SOCIAL MEDIA GIST PATTERNS INFLUENCE ONLINE ENGAGEMENT BEHAVIOR
Delineating key semantic patterns (e.g., gists) that drive online behavior can help us gain insight into how social media language impacts the dissemination of health information online. This, in turn, can better inform public communication strategies for time-sensitive health interventions. Hence, in Study 3, we use Granger causality to examine the extent to which gist patterns influence online engagement, such as up-voting and commenting, in subreddit communities that oppose COVID-19 health practices.

H1. The daily volume of gists in cluster i significantly Granger-causes the upvote ratio of Reddit posts containing gists in cluster i.

H2. The daily volume of gists in cluster i significantly Granger-causes the number of comments associated with Reddit posts containing gists in cluster i.

Method and Analysis
First, we constructed a time series T_i for each cluster, where T_i represents the daily number of gists in cluster i from May 2020 to October 2021. We then created two temporally corresponding time series, T_i^upvote and T_i^comment, representing the daily upvote ratio and the daily comment count for each Reddit post containing gists from cluster i, respectively. We conducted a total of 20 Granger causality tests (5 clusters × 4 hypotheses: H1, H2, H1R, H2R), using time lags ranging from 1 to 14 days. To ensure that the values of each time series were not merely a function of time, we conducted the Augmented Dickey-Fuller (ADF) test [21], applying serial differencing until each series achieved stationarity at the 5% significance level.

Results
Table 4 shows significant Granger-causal results (p < 0.05). Gists across certain topics are significantly predictive of up-voting and commenting patterns, and vice versa, in banned subreddits that oppose pandemic health practices. Specifically, the daily volume of gists significantly forecasts up-voting and commenting behavior for the topics of vaccines (cluster 1), mask-wearing (cluster 2), and the macro-impacts of the pandemic (cluster 4), with significant lag lengths ranging from 2 to 7 days. These results align with prior research highlighting the linguistic power of gists in spreading online information. The reverse (H1R and H2R) holds for gists discussing the impact of lockdowns (cluster 3): up-voting and commenting behavior both significantly forecast fluctuations in the volume of lockdown-related gists.
3.1 Bidirectional Causality: Notably, for cluster 2, which pertains to controversies and policies related to mask-wearing, we observe an interesting feedback loop between gist volumes and engagement behavior. As the volume of gists related to mask-wearing practices increases, online engagement with posts containing such gists also increases. This engagement, in turn, further influences the volume of gists topically related to mask-wearing practices. In other words, there is a mutually reinforcing effect between causal language and online behavior in the context of mask-related discussions.

STUDY 4: HOW SOCIAL MEDIA GIST PATTERNS INFLUENCE NATIONWIDE TRENDS IN HEALTH OUTCOMES
In Study 4, we address whether and how social media language patterns, in the form of gists, influence health decisions and outcomes in the U.S. We follow Study 3's application of Granger causality to examine the relationship between gist patterns and important COVID-19 health decisions and outcomes in America. Given the extensive attention the subreddits we analyzed received from the American public and media [43], we focus on U.S. health outcomes.

COVID-19 Data on Health Outcomes
We used the following data from Our World in Data 2, a trusted source of COVID-19 health data, for our analysis:
• Number of Vaccinations (NV): the total number of COVID-19 vaccine doses administered on a given day.
• General Hospitalizations: the number of patients hospitalized with COVID-19 on a given day.
• ICU Hospitalizations: the number of COVID-19 patients in intensive care units on a given day.
• Total COVID-19 Cases: the cumulative number of confirmed COVID-19 cases.
• New Daily COVID-19 Cases: the number of newly confirmed COVID-19 cases on a given day.

Hypothesis Testing with Granger Causality
Following Study 3, we Granger-test the relationship between the daily volume of gists and patterns in people's health decisions (vaccinations) and national health outcomes (general/ICU hospitalizations, total/new daily COVID-19 cases) through H3 and its reversed variation (H3R):

H3. The daily frequency of gists in cluster i significantly Granger-causes people's health decisions and/or national health outcomes, where i ranges from 1 to 5.

H3R. People's health decisions and/or national health outcomes significantly Granger-cause the daily frequency of gists in cluster i, where i ranges from 1 to 5.
We created five time series, one for each of the five health outcome measures described above, temporally aligned with the time frame of Studies 1-3. We performed 25 Granger causality tests (5 clusters × 5 health outcome series) with lag times ranging from 1 to 14 days, and conducted ADF tests using the serial difference method to ensure statistical robustness.
Table 5: Results of the Granger causality tests for relationships between Reddit discussion clusters (C1-C4) and the health outcomes dataset. Cluster 5 is not included due to the absence of significant Granger causality findings. Notes: See Appendix A for complete statistical results (Tables 7 and 8).

Results
Table 5 shows significant Granger-causal results with corresponding lag lengths (p < 0.05). We summarize our findings below.
Causal Talk Around Vaccines and National Vaccination Trends are Bidirectional. Our results demonstrate bidirectional causality between causal discourse patterns related to vaccines and the number of vaccinations administered in the U.S. The daily volume of cluster 1 gists, which consist of causal arguments related to vaccine regulations, efficacy, and side effects, is predictive of vaccination patterns across the U.S., and vice versa. However, the lag lengths differ between H3 and H3R: it takes 4 days for gist patterns to influence vaccine adoption (H3), while it takes two weeks for vaccination trends to shape how people talk about vaccine-related topics in a causal manner (H3R) across COVID-19 subreddits known for vaccine skepticism. In addition to a more significant Granger-causal relationship, we also observe a higher Pearson correlation for H3 (r = 0.413, p = 0.005) compared to H3R (r = 0.105, p = 0.028). The reverse direction (H3R) nonetheless indicates that national vaccination patterns also shape vaccine-related causal language on social media, for which there are two possible explanations. First, as more people get vaccinated, online discussions of the experiences and potential side effects of vaccines may become more prevalent, leading people to talk in a causal manner about the side effects of vaccines (e.g., "Had my Pfizer jab last Wed and have felt like death since"). Another possible explanation is that the increasing vaccination requirements imposed by corporations and governments as a condition for work or travel (and, therefore, the nationwide uptick in vaccinations) during the pandemic may have compelled vaccine skeptics to argue more vehemently against vaccines [25]. Previous research has shown that vaccine skeptics are susceptible to confirmation bias, as are most individuals, such that initial beliefs lead to polarization [66]. That is, vaccine skeptics are likely to seek out and discuss information about vaccines that confirms pre-existing beliefs when presented with opposing information or situated in contexts that challenge their views. Our findings align with this research, suggesting that as national vaccination uptake increases, vaccine skeptics might increasingly argue against vaccines in a causal manner (e.g., "If you take the vaccine, it's probably because you're unhealthy."), as commonly expressed in posts that contain cluster 1 gists.
Causal Talk Around Mask-Wearing Practices Significantly Predicts Trends in COVID-19 Cases. Our Granger-causal results show that national health outcomes, such as the total and new daily COVID-19 cases, can be significantly predicted by the volume of mask-related gists (cluster 2) with a lag of 5 days. The mask mandate was one of the most controversial health practices, affecting people of all ages and occupations during the pandemic [62,98]. Parents were polarized over school mask requirements to the point of resorting to violence [94]. Employees who asked customers to wear masks were physically assaulted [8]. Although people initially adhered to mask-wearing, more individuals came to protest mask mandates both on- and offline, citing physical distress ("If having healthy lungs is important for COVID, why would we wear masks that reduce lung function?") or invasion of personal rights ("They will call you a 'coward' or 'scared' for not wanting an intrusive mask over your face (for no reason)"), as exemplified by posts containing cluster 2 gists in our data. Over time, the proliferation of anti-mask views, followed by extreme resistance in the form of violent altercations and wide-scale protests across the nation, may have led people to abandon mask-wearing practices [36], which in turn may have contributed to an increase in COVID-19 cases within a relatively short time frame of 5 days, as indicated in our results.
Rising Hospitalization Trends Prompt Causal Talk on Lockdown Impact. Our findings show that nationwide trends in the number of patients hospitalized in general and intensive care units significantly prompt more gists discussing the impact of lockdowns, with lags of 9 and 14 days, respectively (Table 5). Nationwide lockdowns were implemented to curb steep rises in COVID-19 cases and hospitalization rates. In fact, some posts containing cluster 3 gists explicitly link lockdowns with hospitalizations: "The main reason for implementing restrictions or lockdowns was to prevent ICUs from overflowing." Despite its necessity and intended benefit as a public health measure, studies have shown that lockdowns significantly contributed to social isolation, declines in mental health, and rises in domestic violence across the U.S. [18]. As lockdowns continued to amplify challenges and problems in people's lives, rising hospitalization trends across the country may have heightened people's fear and distress, leading to more intensified causal online discourse on the lockdown's impact on everyday life. Such sentiments are clearly expressed across posts containing cluster 3 gists: "People are literally starting to go hungry because of lockdown restrictions"; "The implementation of lockdowns has resulted in more harm than good".
Rising Trends in COVID-19 Cases Prompt Causal Talk on the Pandemic's Macro-Level Impact. Nationwide trends in COVID-19 cases significantly Granger-cause the volume of gists discussing the pandemic's impact on society at large, with a lag of 9 days for both total and new cases. In other words, increasing trends in COVID-19 cases seem to nudge people to talk causally about the macro-level consequences of COVID-19. COVID-19 presented major economic and social setbacks that affected all aspects of society. Some of these concerns were expressed in posts containing cluster 4 gists that linked the pandemic with economic crises ("The pandemic caused one of the largest economic crises, which in turn led to one of the largest poverty and hunger crises"), decreased life expectancy ("The COVID-19 pandemic has caused the biggest drop in US life expectancy since the second world war"), potentially oppressive public health measures ("The cause of the next deadly pandemic will lead to the implementation of authoritarian prevention measures"), and even racism ("The fact that Covid19 affects people of color more than whites is the cause of the conclusion that Covid19 is racist"). With COVID-19 cases rising and circumstances remaining unpredictable, people may have become more anxious and distressed about the long-term effects on society. Consequently, individuals may have discussed the pandemic's impact in a causal manner on social media as they tried to make sense of its far-reaching consequences [84].

DISCUSSION
In summary, our findings underscore RBIC's effectiveness in efficiently predicting social media gists at scale (Study 1), thereby enriching our insight into the underlying mental constructs that shape people's health decisions and attitudes towards public health practices. In Study 2, we cluster and track the evolution of these gists, revealing key themes in online arguments against pandemic health practices; gist volumes closely align with significant topical events, such as health announcements, policy changes, and leadership statements. In Study 3, we empirically demonstrate how gist volumes significantly drive subreddit engagement patterns (upvotes and comments). Finally, Study 4 reveals the interplay between gist patterns in anti-COVID-19 subreddits and nationwide health trends. We discuss the implications of these findings below.

Harnessing Large Language Models in Computational Social Science (CSS) Research in HCI
Prompt-based LLMs are increasingly used in the CHI community [20,56,76,110], primarily contributing to the development of applications like chatbots [46] and tools for co-writing [56], virtual simulations [108], story-telling [23], and visualization enhancement [96]. Such studies have primarily used LLMs as production tools [56] rather than tools for analysis. More recently, computational social scientists in HCI have used prompt-based LLMs for text analysis [34,102,123]. However, several challenges remain in using LLMs for nuanced examination of social media discourse. First, traditional NLP models and commonly used LLMs in CSS research often lack reasoning capabilities [116]. For instance, BERT-based models, which are extensively used in HCI research analyzing large volumes of social media data [27,60], are typically fine-tuned for specific, discrete downstream tasks (e.g., classification). While these pretrained language models have shown promise in performing discrete analyses, emerging HCI research [116,117] demonstrates the additional value of prompting LLMs to perform multi-step reasoning for a more comprehensive analysis. Building on these insights, RBIC enables a more nuanced analysis of social media discourse by leveraging the multi-step reasoning capabilities of large language models. To this end, RBIC performs multiple, step-by-step, interrelated sub-tasks (question-answering, classification, extraction, generation) prior to generating its final output. This incremental coaching mechanism enhances the model's overall understanding and performance on the final task, allowing us to analyze social media discourse in a more comprehensive and nuanced way. Second, LLM development paradigms often incentivize researchers to optimize model performance on established evaluation datasets [20,58]. While valuable for comparing an LLM's performance with other models, this approach may not yield high performance on new, unseen, in-the-wild datasets [22,123] or on tasks that differ slightly from those the model was evaluated on [123]. As a result, this may limit the applicability of such LLMs for analyzing intricate, heterogeneous in-the-wild data, such as unstructured social media conversations. The role-based cognition component of RBIC addresses this limitation by allowing researchers to define and customize the role of any prompt-based LLM to perform a complex and nuanced language task. By introducing and applying RBIC in the analysis of social media conversations, we demonstrate the versatility and effectiveness of prompt-based LLMs in identifying and synthesizing nuanced linguistic patterns, thus broadening their potential for theory-driven textual analysis in CSS research in the HCI domain.

Leveraging Causal Language Patterns in Online Content Moderation Practices
Our results show that the volume of gists across certain topics is significantly predictive of up-voting and commenting patterns, and vice versa, in banned subreddits that oppose pandemic health practices. For example, daily gist volumes significantly predict up-voting and commenting behavior across topics related to vaccines, masks, and the pandemic's impacts, highlighting the linguistic power of gists in spreading online information, as demonstrated in prior literature [85,86]. Similarly, our findings show that increasing trends in vaccine adoption in the U.S. are strongly predictive of growing volumes of vaccine-related gists in subreddits whose members are generally skeptical of vaccines. While a nationwide rise in vaccine uptake is certainly beneficial, such conditions may present challenging contexts that further entrench vaccine skeptics in their views. Vaccine opponents exposed to situations that contradict their perceptions are especially vulnerable to confirmation biases [5], which may increase their tendency to express anti-vaccine sentiments in online communities in a causal manner, as implied by our findings. These insights underscore the critical role of understanding and monitoring causal language patterns in public health discourse, particularly within online spaces. Current content moderation practices that rely on language models traditionally focus on flagging hate speech or monitoring specific keywords [91]. However, our research suggests that monitoring causal language patterns can be a valuable addition to these practices, especially in controversial online communities where people exchange and learn health information. By leveraging nuanced insights from gists across various health topics, content moderation can become more effective in identifying and managing discussions that may contribute to the spread of online health misinformation or resistance to public health guidelines.

Design Implications for a Moderation Dashboard
Prior studies have shown that the design of a social media platform plays an important role in promoting transparency in content moderation [47].Moderators often fail to articulate what aspect of the content prompted moderation or why such moderation was necessary [47].The approach taken in our study can be built on to effectively inform users about the consequences of their posting behavior, and which aspects of their posts can potentially lead to negative outcomes.The results can also inform design strategies that platforms can undertake to assist moderators in communicating such information to users.
Understanding and identifying causality can be difficult for humans, as causality may be expressed implicitly and across sentences, or intersententially [93]. Currently, there is no automated mechanism for moderators to systematically identify and understand the impact of causal language across online discussions. A design feature in the moderation dashboard, such as the one shown in Fig. 4 (Appendix D), illustrates how RBIC may address this gap. For example, when a moderator clicks a button labeled 'Enable Gist Detection (RBIC)', an RBIC-powered extension can automatically scan posts, highlight cause-and-effect pairs, and identify the overarching gists within the posts. This functionality may also allow moderators to see a list of the top gists across community discussions in descending order of gist volume, with an option to re-sort these gists by engagement metrics, including upvote ratio and comment volume. Additionally, the system may allow the moderator to drill down into the posts pertaining to each top gist, with the relevant cause and effect text spans highlighted in each post.
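As a purely illustrative sketch of the ranking logic such a dashboard feature might use (the GistRecord fields, function name, and example gists are hypothetical, not drawn from our data or an existing system):

```python
from dataclasses import dataclass

@dataclass
class GistRecord:
    """Hypothetical per-gist record a moderation dashboard might keep."""
    gist: str
    volume: int          # number of posts containing this gist
    upvote_ratio: float  # mean upvote ratio of those posts
    comments: int        # total comments on those posts

def top_gists(records, sort_key="volume", n=3):
    """Rank gists by volume (the default dashboard view), or re-sort
    by an engagement metric such as upvote ratio or comment count."""
    keys = {
        "volume": lambda r: r.volume,
        "upvote_ratio": lambda r: r.upvote_ratio,
        "comments": lambda r: r.comments,
    }
    return sorted(records, key=keys[sort_key], reverse=True)[:n]

records = [
    GistRecord("vaccine mandates cause job losses", 120, 0.91, 450),
    GistRecord("mask mandates reduce lung function", 95, 0.88, 610),
    GistRecord("lockdowns worsen mental health", 80, 0.95, 300),
]
for r in top_gists(records, sort_key="comments"):
    print(r.gist, r.comments)
```

In a deployed system, the records would be produced upstream by RBIC's gist detection and aggregated per cluster before ranking.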

Improving Moderation and Community Guidelines
Posts that do not contain moderator-specified keywords (e.g., profanity), or that exclude explicit causal language, can still violate community norms or include misleading information in subtle ways [72]. Traditional keyword-based filters fall short in identifying such content [45]. This can make it difficult to set specific rules for moderation practices, explain moderation decisions, or adapt community guidelines during critical times, such as a global pandemic. With RBIC-powered gist detection, moderators can scale the search for such posts to identify those that reflect common and theoretically predicted disconnects between the public and public health experts. This mechanism can potentially enable moderators to use concrete examples to better explain moderation decisions, as well as improve community guidelines to explain how posts containing implicit causal narratives may impact people's knowledge and decisions around safe health practices, as shown in our work.

Broader Implications for Understanding Engagement Patterns Across Online Communities and Offline Health Outcomes During Public Health Crises
Our work shows that capturing psychologically important language patterns across social media, in the form of gists, can be useful in predicting human behavior and, consequently, health outcomes. In Study 3, we demonstrate that fluctuations in the volume of gists can significantly predict online engagement patterns, specifically in terms of up-vote ratio patterns (H1) and the volume of comments (H2). This has important implications for researchers studying user behavior in online communities [37,118]. Researchers have shown that the virality of online content is often influenced by a positivity bias in engagement metrics [51], such as up-votes and comments: posts receiving higher engagement are more visible and thus more likely to go viral [3,81]. This tendency can exacerbate the spread of misinformation, especially during public health crises [1,95]. Posts challenging pandemic health practices are often laden with misleading information [55], and online posts embedded with gists are more likely to attract user engagement than those without gists [14]. Our H1 and H2 results demonstrate that such user engagement patterns are predictable through gist volumes, highlighting the potential of using RBIC for gist analysis to track and understand how health-related content, especially during pandemics, resonates with and influences online user engagement. This insight is crucial for developing strategies to combat misinformation and guide public health communication effectively. Furthermore, HCI research in crisis informatics has advanced public health monitoring systems by developing tools that track public health outcomes, online engagement patterns, or health-related topics on social media [59,74]. Some of these tools extract various linguistic aspects from social media discourse, such as sentiment [59] and topical keywords [104]. While these advancements have provided valuable descriptive insights, most do not go the full distance in linking such linguistic patterns to real-world health decisions and outcomes [55]. Our work addresses this gap by demonstrating how RBIC can be leveraged to better connect online conversation patterns to offline health outcome trends. Study 4 results show that online causal talk related to controversial health practices, such as face masks, is significantly predictive of total and new daily COVID-19 cases across the U.S. Likewise, our findings show that the uncertainty arising from deteriorating national health outcomes may prompt people to increasingly engage in online causal discussions about the pandemic's influence on their lives and society as a whole. For example, nationwide COVID-19 case and hospitalization patterns significantly drive up the volume of gist-based conversations concerning the pandemic's impact on society, the economy, and individuals under lockdown. These findings imply that integrating gist-based language patterns into public health monitoring systems holds promise for gaining valuable insights into the cognition underlying skepticism and resistance to public health practices and, by extension, their impact on real-world health outcomes. Integrating RBIC-powered gist detection and real-time analysis of national health indicators into such tools can potentially enhance public health agencies' ability to understand and respond to critical health challenges in relation to people's online behavior.

CONCLUSION & LIMITATIONS
This research synthesizes LLM techniques with theoretical perspectives from cognitive and social psychology to advance knowledge of health decisions and outcomes in the context of the most recent pandemic. Our work is the first to systematically identify and characterize how causal language patterns in social media discussions opposing pandemic health practices significantly predict national health outcomes. These findings carry crucial implications for public health communication and policy interventions. By recognizing the influential role of causal language patterns across social media in shaping national health outcomes, public health efforts and online moderation practices can be tailored to address and mitigate social media conversations that adversely affect public health outcomes.
Our study has a limitation in its data source: it concentrates on Reddit posts and omits comments. This exclusion is primarily due to certain months of comment data being either restricted or deleted by archive administrators in compliance with Reddit's policies. While this focus allows for an in-depth analysis of original posts, it may not capture the full discourse, including the diverse viewpoints and nuanced discussions that often take place in the comments section. Consequently, our findings may offer a limited perspective on the topic under study. Future work might consider alternative ways to capture community discourse, such as interviews or surveys, to complement the data from Reddit posts. Furthermore, as larger datasets become available, integrating machine learning models capable of detecting subtle changes in discourse over time and scaling to extensive datasets may offer a dynamic view of how gists evolve. Such methods have the potential to uncover patterns and trends that may not be immediately obvious under a traditional unsupervised clustering approach.
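For readers unfamiliar with the baseline, a minimal sketch of the traditional unsupervised clustering approach mentioned above, applied to gist statements, might look as follows. The gist strings, TF-IDF featurization, and cluster count are illustrative assumptions rather than our exact pipeline.

```python
# Sketch: grouping extracted gist statements into themes with a
# traditional unsupervised clustering pipeline (TF-IDF + k-means).
# The gists and the cluster count below are fabricated placeholders.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

gists = [
    "mask mandates cause businesses to lose customers",
    "mask rules hurt small business revenue",
    "vaccine requirements lead to job losses",
    "vaccine mandates cause workers to quit",
    "lockdowns increase feelings of isolation",
    "lockdowns cause loneliness and stress",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(gists)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

for gist, label in zip(gists, labels):
    print(label, gist)
```

In practice, sentence embeddings would likely separate themes more reliably than TF-IDF, and the number of clusters would be chosen with a criterion such as silhouette score rather than fixed in advance.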
In summary, we built an LLM-based model to identify psychologically influential mental representations, gists, from social media posts; demonstrated the links between these gists and public health events; and verified associations with user engagement and national health trends, with implications for HCI design and the promotion of public health.

Figure 1: Our research empirically links social media conversations to national health outcomes related to COVID-19. We introduce Role-Based Incremental Coaching (RBIC), a Large Language Model (LLM) prompting framework. (A) We collect Reddit datasets focused on communities known for opposing COVID-19 health practices. (B) Guided by Fuzzy-Trace Theory, RBIC extracts cause-effect pairs and formulates coherent gists capturing causal relations in text. (C) Granger-causality tests and data analytics reveal the impact of these gists on community engagement and national health outcomes such as vaccination uptake and hospitalization rates.
Figure 2 (excerpt), Role-based Knowledge Generation prompts: "Your role is to understand the cause-effect relationships in social media posts. Can you provide a brief definition of what a cause-effect relationship is?" "Based on your role, can you explain the term 'causal gist' in relation to sentences that have causal coherence?" "So, given the sentence: 'I took the vaccine yesterday. I'm really sick now.' [...]"

Figure 2: Illustration of the Role-Based Incremental Coaching (RBIC) prompting framework. RBIC incorporates role-based cognition and sub-task training to improve the model's comprehension of a specific task before generating the final output.

Figure 3: The upper portion of the illustration displays the progression of clusters across four-month periods. The line graph illustrates the month-by-month evolution of the number of posts containing gists, representing the central themes discussed on Reddit concerning health mandates. The graph highlights the dates when each topic was most prominently discussed and presents relevant news events related to COVID-19 and health mandates during those periods.

Table 1: Sample results from applying the RBIC method for extracting cause-effect relationships and generating gists from Reddit posts discussing health mandates between May 2020 and October 2021, including the post content.

Table 2: Inter-rater reliability scores for human evaluation of topic clustering results. Fleiss' kappa coefficient was calculated to assess agreement among six annotators judging whether each gist was correctly assigned to one of the five clusters listed.
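To make the Table 2 agreement measure concrete, the sketch below computes Fleiss' kappa for six annotators assigning gists to clusters. The ratings matrix is fabricated for illustration; it is not our annotation data.

```python
# Sketch: Fleiss' kappa for six annotators assigning items (gists) to
# one of five clusters. Ratings below are invented for illustration.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = gists, columns = annotators; values = assigned cluster (1-5)
ratings = np.array([
    [1, 1, 1, 1, 1, 2],
    [2, 2, 2, 2, 2, 2],
    [3, 3, 3, 2, 3, 3],
    [4, 4, 4, 4, 5, 4],
    [5, 5, 5, 5, 5, 5],
    [1, 1, 2, 1, 1, 1],
])

# Convert per-rater labels into an item x category count table
table, _ = aggregate_raters(ratings)
kappa = fleiss_kappa(table)
print(f"Fleiss' kappa = {kappa:.3f}")
```

Values above roughly 0.6 are conventionally read as substantial agreement, though such thresholds are heuristics rather than formal tests.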

Table 3: Representative examples are shown for each cluster from C1 to C5, highlighting the main ideas identified in the health mandate debate on Reddit. Cluster 6 is not included, as it lacks a distinct causal relationship or coherent gist.

Table 4: Granger causality test results analyzing the relationships between the daily volume of gists in clusters C1-C5 and online engagement behavior: upvote ratios (UR) and number of comments (NC), across Reddit discussions.