GuesSync!: An Online Casual Game To Reduce Affective Polarization

The past decade in the US has been one of the most politically polarizing in recent memory. Ordinary Democrats and Republicans fundamentally dislike and distrust each other, even when they agree on policy issues. This increase in hostility towards opposing party supporters, commonly called affective polarization, has important ramifications that threaten democracy. Political science research suggests that at least part of this polarization stems from Democrats' misperceptions about Republicans' political views and vice-versa. Therefore, in this work, drawing on insights from political science and game studies research, we designed an online casual game that combines the relaxed, playful nonpartisan norms of casual games with corrective information about party supporters' political views that are often misperceived. Through an experiment, we found that playing the game significantly reduces negative feelings toward outparty supporters among Democrats, but not Republicans. It was also effective in improving willingness to talk politics with outparty supporters. Further, we identified psychological reactance as a potential mechanism that affects the effectiveness of depolarization interventions. Finally, our analyses suggest that the game versions with political content were rated to be just as fun to play as a game version without any political content suggesting that, contrary to popular belief, people do like to mix politics and play.


INTRODUCTION
Over the past decade, there has been a significant rise in affective polarization in the US, defined as "the tendency of people identifying as Republicans or Democrats to view opposing partisans negatively and copartisans positively" [27]. Increasingly, partisans ascribe negative stereotypes to the other side, calling them closed-minded, unpatriotic, and immoral [12]. In early 2020, 72% of Americans reported believing that the opposing party is "a serious threat to the United States and its people" and 59% reported somewhat or strongly believing that the opposing party is "downright evil" [28]. This increase in affective polarization has important social, economic, and political ramifications that threaten to tear the fabric of American democracy. Americans are more reluctant to talk to opposing partisans, even about nonpolitical topics [68]. Affectively polarized partisans are significantly less likely to be comfortable with outpartisans as friends or neighbors [26]. Affective polarization also influences economic decisions, such as where people buy and how much they are willing to pay for goods and services [46]. In the political realm, affective polarization reduces trust in an outparty government and reduces support for compromise with outparty elites, increasing partisan gridlock [24]. Further, a recent study highlights a link between affective polarization and specific policy positions. Researchers found that as partisan animus increases, Republicans are less concerned about COVID-19 and are less supportive of mitigation policies, though their opposition is tempered by the level of infections in their county [14]. Given its wide-ranging consequences, the high levels of affective polarization we observe today in US politics are extremely concerning.
In this context, we introduce GuesSync!, a two-player casual online game that attempts to reduce affective polarization by addressing a key driver of polarization: people's misperceptions about the other side. While Republicans and Democrats have deep differences, perceived differences between these groups have been exacerbated over the past few decades [42] because of various factors such as selective mass media coverage, the rise of partisan outlets, and social media [26]. These misperceptions result in increased partisan hostility [19] and, in turn, reduced deliberation with the opposing partisans [25]. Indeed, a recent mega-study (n=32,059) testing 25 interventions designed by academics and practitioners to reduce Americans' partisan animosity and anti-democratic attitudes found misperception correction to be the most effective method [74].
Crucially, GuesSync! initiates a fun, playful nonpartisan norm around correcting misperceptions. Through visual elements and game mechanics that do not provide partisan cues, the game constructs a "magic circle" [71], a separate social and psychological space that players enter when deciding to play a game. Within this space, the rules and norms of the game are activated and likely supersede, at least for the duration of the game, the hostile partisan norms we observe today. Further, by embedding a misperception-correcting intervention within a casual game, we expect to attract a larger audience than other misperception-reducing interventions such as hosting political discussions in facilitated groups [43], which are likely to be attended by only the most politically engaged.
To study the effects of playing the game, we performed a pre-registered between-subjects online experiment with 665 participants. Participants played one of three versions of the game: a control version without any misperception correction content, a mixed version that had some misperception correction content, and a fully political version that had misperception correction content in all rounds of the game. After the game, participants answered a survey containing three affective polarization outcome measures (feelings thermometer ratings, social distance, and willingness to talk to an outparty supporter) along with measures of potential mediators and moderators. We also collected game experience-related measures to compare across the three game versions.
We summarize a few key results. We did not detect a statistically significant reduction in negative attitudes towards outpartisans between the control version of the game and the two treatment versions (no main treatment effect). However, performing a pre-registered moderation analysis, we found that Democrats playing the treatment versions of the game exhibited warmer feelings towards Republicans than Democrats playing the control version. In line with prior research, Democrats over-estimated Republicans' support for conservative political views, and correcting these estimates through the game reduced affective polarization. We did not observe a similar effect for Republican players, likely because of our choice of game questions on Democrats' political views. Playing the mixed game version also increased the willingness to talk about political issues with outparty supporters compared to control. We also identified psychological reactance as a potential mechanism that might affect the effectiveness of depolarization interventions. Interestingly, we found no difference between game favorability ratings given by players playing the control version of the game and the two treatment versions, suggesting that adding more political questions to the game did not appreciably reduce how fun and enjoyable the game was.

Correcting misperceptions about Republicans and Democrats to reduce affective polarization
A wide array of social science research has established that partisans perceive wider differences between the two parties than the actual difference that exists [18,42]. Numerous factors contribute to this perceived polarization. Mass media coverage typically focuses on polarization [39], and the most extreme politicians are extensively covered. Partisan outlets show both elites and ordinary outpartisans as extreme [20]. Further, exposure to political discussions on social media, which are usually between strong partisans, also adds to the illusion that most outpartisans are extreme and have little common ground with the other party [4]. Recently, there has been growing consensus that misperceptions about the outparty contribute significantly to affective polarization and correcting for these misperceptions can help reduce affective polarization [19,74]. Social scientists have explored correcting different misperceptions about the outparty such as ideological extremity [13], political engagement [13], party composition [3] and group meta-perceptions [35] to reduce affective polarization. For example, Ahler and Sood [3] found that people significantly overestimated the extent to which outpartisans belong to party-stereotypical groups (for example, Democrats who are union members and Republicans who are Evangelical), and correcting for these misperceptions reduced outparty animus. Lees and Cikara [35] also demonstrated that people overestimated outgroup negativity towards the ingroup (group meta-perceptions), and correcting the inaccuracy reduced negative outgroup attributions. Similar group meta-perception corrections have been shown to be effective across over 25 countries [62].
Druckman et al. [13] showed that, without any additional information, people imagine the typical outpartisan to be more ideological (liberal Democrat and conservative Republican) and more politically engaged than is the reality, resulting in outparty animus. When outpartisans are described as moderate (and modal Republicans and Democrats are, in fact, moderate), people exhibit reduced hostility towards outpartisans. Thus, correcting perceptions of the ideological extremity of the outparty can reduce outparty hostility. Although strictly speaking, affective polarization concerns both outparty hostility and inparty favoritism, scholars working on polarization primarily focus on the former as outparty animus especially has pernicious effects on political and social life. Similar to the studies mentioned above, our work also focuses on partisan affect and behavioral intent toward outparty individuals.
In this study, we expand on Druckman et al.'s study by correcting misperceptions about specific political views held by ordinary Republican and Democratic supporters rather than misperceptions about the ideological makeup of the parties' supporters. Focusing on supporters' views on specific political topics instead of simply how liberal or conservative they are allows us to present a more nuanced picture of the supporters' views and convey the complexity of their issue positions. This strategy is especially important as an analysis of nationally representative ANES survey data (Appendix Section 6 in [56]) shows that only about 8% of partisans hold ideologically consistent positions across multiple issues such as abortion, gun control, and welfare despite their prominence in electoral politics. Further, highlighting views on specific political topics provides more opportunities to establish common ground on a variety of topics. Thus, we expect that playing a game providing corrective information about party supporters' policy views can reduce affective polarization.
One concern with correcting misperceptions about outparty members is the potential for a "backfire effect" where the correction entrenches people's belief in the misperception, especially in cases where the issue is salient or identity relevant [53]. However, as Nyhan notes in a recent survey article [52], these backfire effects are extremely rare. The above studies on misperception correction reducing affective polarization did not result in such effects. But even in the absence of backlash, Nyhan summarises that the effects of the misperception correction are only moderately effective owing to a range of factors: motivated reasoning towards claims that are more congenial, continuous elite and partisan messaging that reinforces misperceptions, lack of targeting factchecking towards people with the most exposure to misinformation and low levels of cognitive ability and processing effort among the public. Although not addressing all these factors, our game design did not provide additional partisan cues that encourage partisanship-motivated reasoning. Further, by designing the game such that more accurate answers are incentivized, the in-built accuracy motivation likely makes individuals more receptive to the corrective information than default or motivated reasoning, as has been observed in survey experiments [73]. Also, since we correct misperceptions about party supporters' views and not factual beliefs (such as Barack Obama being born in the US), we likely encounter less resistance to corrective information on these topics.

Pro-social Games
Over the past decade, designers and researchers have aimed to change attitudes and behaviors through game mechanics and gameplay. These games have been designed to either be direct in their issue goals and game mechanics or be implicit and obfuscate the true intentions of the game.
By far, the most common approach is the direct one. These games rely on explicit procedural rhetoric [6] through the characters, rules, and scenarios modeled in the game, which overtly promote the desired outcome. The assumption is that this design will "encourage and enable players to internalize, and transfer, the game's modeled beliefs and behaviors to real-life contexts" [30]. For example, in Darfur is Dying, players take on the role of a refugee trying to find water in a desert to bring back to the refugee camp while evading being killed by the militia. Playing the game elicited greater role-taking and increased willingness to help Darfurian refugees than simply reading a text containing the same information [58]. Spent is another game that simulates a scenario where the players are single parents without a job or a home and need to survive on $1000 for the month. Testing on middle and high school students, playing Spent was found to have significantly increased affective learning scores, a measure of the internalization of positive attitudes towards homeless populations, even three weeks after playing the game [63]. However, these overt persuasion approaches may not always be effective and sometimes even backfire, causing more harm to the target populations. For example, researchers found that playing Spent led players to believe that poverty is controllable and did not promote positive attitudes towards the homeless among online adult and undergraduate study participants [61].
Kaufman et al. [30] suggest that such explicit efforts may fail also because they might trigger psychological reactance [7]. Psychological reactance theory suggests that individuals experience motivational arousal when they perceive that their freedom to think, act and hold opinions freely is threatened by an external agent. Such a state makes individuals more resistant to persuasion and may even lead to individuals shifting their behaviors or attitudes in the opposite direction of the perceived pressure [60]. Thus, if players perceive a game to be overtly forcing opinions onto them, it might dampen the persuasive effects of the game. Further, making persuasion attempts direct and on-message may hamper the players' ability to fully immerse themselves into the transformative experience of the game.
A recent alternate approach is to incorporate stealth interventions within the game. Popularized by Kaufman et al. [30], this 'embedded design' approach aims to effect change by incorporating the persuasive mechanism in an implicit and subtle way within the game mechanics or game context rather than making the persuasive message the focal point of the game. One embedding strategy is intermixing, which interweaves and balances on-message and off-message content to make the persuasion non-threatening and palatable. In Buffalo, marketed as a party trivia game, players flip a person card (such as scientist) and a descriptor card (such as female) and need to name a real or fictional person that fits the descriptions in the cards ('female scientist') as fast as possible. The game employs intermixing by mixing on-message (stereotype-breaking) descriptor cards with off-message ones. The game also obfuscates its persuasive intentions by presenting simply as a party game without the de-stereotyping framing. Experiments [29] suggest that the game reduces prejudice and stereotyping by encouraging "greater inclusiveness in players' representations of social identity groups." In this work, we experiment with two treatment versions of the game: a fully political, direct persuasion version, in which all questions are about political views held by Republicans and Democrats and which explicitly aims to correct political misperceptions, and an indirect persuasion version that employs the intermixing strategy and includes a few political misperception corrections but is still largely nonpolitical. We compare the effects of these two versions on affective polarization measures with outpartisans against a control version of the game containing no misperception-correcting information.

Design research on affective polarization
There are broadly two kinds of political polarization: affective polarization (the focus of this work) and ideological polarization. Ideological polarization is the divergence of political beliefs and stances on policy issues towards the extremes. Most design research has focused on identifying and countering designs that exacerbate ideological polarization through filter bubbles. These approaches typically afford users the ability to break echo chambers and engage with alternate and divergent viewpoints [49][50][51]. However, design research on reducing affective polarization is still in its nascent stage.
Settle's work [67] highlights how social media design exacerbates affective polarization by facilitating pejorative judgments of the political outgroup. She argues that Facebook's newsfeed exposes people to politically informative content about social contacts with whom politics might not even come up in offline interactions. This content enables users to draw inferences about the partisan identity of the person. Once a user is categorized into a partisan identity, they are attributed overly consistent ideological viewpoints, contributing to a heightened perception of polarization. Further, numeric details about support through social media likes and shares from highly politically engaged users also influence beliefs of how much support there is for even fringe viewpoints. This issue is further compounded as extreme partisans disproportionately share political content on social media [4]. Overall, this partisan categorization results in a negative characterization of the outgroup and positive favoritism of the ingroup [70], leading to affective polarization.
A few initial studies have attempted to reduce affective polarization through design. These approaches have primarily involved reducing the effect of partisan identities in online interactions. Rajadesingan et al. [59] use de-categorization and cross-categorization ideas from inter-group conflict literature to explore designs that reduce the salience of partisan identities to reduce partisan hostility during political discussions. Combs et al. [10] built a mobile chat application, DiscussIt, that pairs a Republican and a Democrat to have private dyadic anonymous discussions on controversial topics to reduce affective polarization. In an experiment, participants using DiscussIt showed reduced affective polarization relative to a control group who wrote essays on the same controversial topics. DiscussIt discussions rely on reducing salient partisan identities and removing the audience feedback (typically through social media likes and shares), which implicitly influences interactions. Saveski et al. [65] take an alternate approach, facilitating perspective-taking to reduce affective polarization. Perspective-taking, where one takes the point-of-view of another, is known to foster empathy and reduce hostility in inter-group conflict situations [11]. They built a browser extension that exposes users to others' Twitter timelines and found that viewing others' timelines reduces affective polarization, especially if the exposure is framed with an empathetic prompt aimed to mitigate inter-group animosity.
In this work, we focus on correcting misperceptions that stem from perceived polarization in a game setting with minimal partisan cues so as not to activate players' partisan identities. By modeling the game as a private dyadic interaction, similar to DiscussIt, we allow players to express themselves freely, unencumbered by what the audience might think of their answers in the game. We discuss more design considerations in Section 3.2.

Game details
GuesSync! is an online two-player cooperative casual game. In the game, each player is randomly matched with another player. The game consists of multiple rounds. In each round, the two players are shown a question. They work together as a team, provide clues and guess the answer. The game design was inspired by two popular games: Family Feud, a popular cable network game where players work as a team to guess survey answers, and Wavelength, a social guessing party game where teams try to read each other's minds using clues.
When a player lands on the game homepage (www.guessync.com), they select their game avatar and input a player name (Figure 1a). Then, the player is shown a tutorial on how to play the game (Figure 1b). After the tutorial, the player enters the matching lobby, where they are randomly matched with another player. Once matched, players can use the in-game chat to talk to their partner and start the game (Figure 1c). Each game consists of seven rounds. Players play two trial rounds followed by five game rounds. The trial rounds are nearly identical to the game rounds except that they provide helpful tips on using the game UI and that no points are awarded. After the five game rounds, players view a game summary listing the total points they scored, and for each round, the question, the correct answer, the team's answer, and the points scored (Figure 2e). Each round consists of four phases: the initial guess phase, the clue-giving phase, the final guess phase, and the grand reveal phase. We describe the four phases below:

Initial guess phase.
At the beginning of each round, both players are asked to independently provide their best guess answer for a question using a slider (Figure 2a). All questions require players to guess a percentage amount, for example, 'what percent of adults have seen the movie Titanic?' Depending on the game version, these questions may be about party supporters' political views. For example, what percent of Republicans (Democrats) think that high-income individuals pay too little in taxes? Players are given 60 seconds to come up with their best guess.

Clue giving phase.
Then, the game assigns one player as the clue-giver and the other as the guesser. The game reveals the correct answer only to the clue-giver. The clue-giver must convey the correct percent to the guesser using a scale provided by the game, for example, a hot-cold scale.
The clue-giver needs to develop a clue using the hot-cold scale to help their partner guess correctly (see Figure 2b). Here, a good clue would be something the partner can identify as being more cold than hot, as the target is closer to the cold end of the scale. 'Lemonade' might be a good clue for this example since it's usually consumed cold. If the correct answer was 5% (close to hot), 'sun' might be a good clue. If the answer was 95% (close to cold), 'arctic' might be a good clue. The scales change in each round, and the players take turns being the clue-giver and guesser. The clues must be only one or two words long, cannot have more than 20 letters, and cannot include numbers, quantifier words such as lot and little, or direction-related words such as left and right. Clues also cannot include words like same and correct that convey the answer without using the provided scales. We maintained a blocklist of such words to ensure that players used clue words that were conceptually on the scales provided. The clue-giver is given two minutes (120 seconds) to enter their clue. While the guesser waits for the clue, they are also provided the scale and the clue-giver's initial guess. They can use this time to think of potential clues the clue-giver might give and what percentage the clues might correspond to.
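The clue rules above can be illustrated with a minimal sketch. This is not the game's actual implementation: the function name `is_valid_clue` is hypothetical, and the blocklist contains only the example words named in the paper, since the full list is not reported.

```python
import re

# Hypothetical blocklist: only the example words mentioned in the text.
# The full blocklist used in the game is not reported.
BLOCKLIST = {"lot", "little", "left", "right", "higher", "lower", "same", "correct"}


def is_valid_clue(clue: str) -> bool:
    """Check a clue against the stated rules (a sketch, not the game's code)."""
    words = clue.lower().split()
    # Rule 1: clues must be one or two words long.
    if not 1 <= len(words) <= 2:
        return False
    # Rule 2: clues cannot have more than 20 letters.
    if sum(ch.isalpha() for ch in clue) > 20:
        return False
    # Rule 3: clues cannot include numbers (digits only; see note below).
    if any(ch.isdigit() for ch in clue):
        return False
    # Rule 4: no blocklisted word, ignoring punctuation around each word.
    cleaned = [re.sub(r"[^a-z]", "", word) for word in words]
    return not any(word in BLOCKLIST for word in cleaned)
```

This sketch only rejects digits. Spelled-out numbers such as "five" would pass unless they were added to the blocklist, and the text does not say how the game handles them.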

Final guess phase.
Once the clue-giver inputs the clue, the guesser must interpret the clue according to the scale and input their team's final answer (Figure 2c). The guesser is given 60 seconds to make their final guess.

Grand reveal phase.
After the final answer has been submitted, the correct answer is revealed to the guesser, and the final guess is revealed to the clue-giver (Figure 2d). Points are awarded based on how close the final guess is to the correct answer. Teams get 5 points if their final guess is within 5% of the correct answer. Teams get 2 points if their final guess is within 10% of the correct answer. Players can talk to each other in this phase through the in-game chat window. Players can either type into the chat or choose one of the game-suggested text input prompts (for example, 'great job!', 'good clue'). The chat is disabled during the other three phases of the game.
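The scoring rule can be written as a short function. This is a sketch under three assumptions that the text does not confirm: "within 5%" is read as an absolute difference of at most 5 percentage points, the thresholds are inclusive, and guesses further than 10 points away score zero. The name `round_points` is hypothetical.

```python
def round_points(final_guess: float, correct_answer: float) -> int:
    """Points for one round, given guesses on a 0-100 percentage scale.

    Assumptions (not stated in the paper): distance is measured in
    percentage points, boundaries are inclusive, and misses score 0.
    """
    error = abs(final_guess - correct_answer)
    if error <= 5:
        return 5
    if error <= 10:
        return 2
    return 0


# Example: a team's total over the five scored rounds,
# using made-up (final guess, correct answer) pairs.
example_rounds = [(22, 25), (40, 48), (70, 90), (10, 10), (55, 61)]
total = sum(round_points(guess, answer) for guess, answer in example_rounds)
```

Under these assumptions the example rounds score 5, 2, 0, 5, and 2 points, for a total of 14.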

Key game design considerations
In developing GuesSync!, we made several key design decisions to maximize the game's effect on affective polarization. We discuss these decisions below:

No prior political knowledge needed.
The game was designed so that players do not need to know the answers to questions to enjoy the game. We deliberately avoided designing it as a political trivia game as such games likely attract only individuals who are interested in politics, especially when a significant portion of the population is agnostic about or downright detests politics [34]. When answering political questions in the game, while knowledge of politics may help, the game primarily revolves around players being able to provide clues based on the scale provided and their partners being able to interpret the clues accurately.

Minimal partisan cues.
The game was designed to avoid presenting partisan cues that may trigger partisan-motivated reasoning and bias. We do not ask about the players' political leanings at any time during the game. The avatars that the players can choose for themselves are cute animals instead of humans, as demographic details may also cue partisan identities since the two parties are also increasingly sorted along racial lines [45]. Further, to avoid potentially priming partisan identity through red and blue colors (commonly associated with the two political parties) [66], we designed the game website such that the primary color scheme is green.

Direct and indirect focus on correct estimates.
In each round, the game first asks both players to independently answer a question, with the correct answer revealed through the course of the round. This approach provides dedicated time at the beginning of the round for players to reflect and input their best guess answer. Then, in the clue-giving and guessing phases, the game still engages players with the correct answer in indirect ways: the clue-giver works on translating the correct answer to a concept on the scale, and the guesser works on translating the concept back to a percentage. These phases encourage more focused engagement with the answer than when individuals are directly provided numerical estimates, as is the case when simply reading news reports.

Interactive design.
Players provide all their percentage answers using sliders. Studies [54] have shown that the physical act of clicking and dragging sliders, as opposed to simply clicking or hovering, creates an immersive experience resulting in cognitive absorption, a state where the person is "consciously involved in an interaction with almost complete attentional focus", which in turn is associated with being more receptive to persuasion.

Slow thinking.
We allocated fairly liberal time limits for each game phase. We provided one minute for players to provide their initial guess, two minutes for the clue-giver to come up with a clue and one minute for the guesser to provide the final guess. We did not want the game rounds to have rapid-fire style interactions that likely result in top-of-the-head responses. By providing adequate time to think through, we allow for slow thinking and more considered responses that experiment data suggest result in smaller levels of misperceptions [2].
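The phase time limits can be collected into a small configuration, shown here as a sketch. The names `Phase`, `TIME_LIMIT_SECONDS`, and `max_timed_seconds_per_round` are hypothetical. The text states no time limit for the grand reveal phase, so that phase is left out of the timed set rather than guessed.

```python
from enum import Enum


class Phase(Enum):
    INITIAL_GUESS = "initial guess"
    CLUE_GIVING = "clue giving"
    FINAL_GUESS = "final guess"
    GRAND_REVEAL = "grand reveal"


# Time limits stated in the text, in seconds. No limit is reported for
# the grand reveal phase, so it is deliberately omitted.
TIME_LIMIT_SECONDS = {
    Phase.INITIAL_GUESS: 60,
    Phase.CLUE_GIVING: 120,
    Phase.FINAL_GUESS: 60,
}


def max_timed_seconds_per_round() -> int:
    """Upper bound on the timed portion of one round."""
    return sum(TIME_LIMIT_SECONDS.values())
```

With these limits, the timed part of a round lasts at most 240 seconds, so the seven rounds (two trial, five scored) take at most 28 minutes, excluding the grand reveal phases.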

Credibility.
To increase the credibility of the game answers shown, we state that the questions in the game come from reputable nonpartisan sources such as Gallup and YouGov, both at the beginning of the tutorial and at the end when players view a summary of the game.

Team interactions.
Players could optionally chat with their teammates before and after each round. The feature allows players to interact and connect with their teammates. Critically, it also provides opportunities for in-game discussion of and reflection on especially surprising answers, which can aid retention [47]. Though chatting was optional, the median number of comments made by players in the experiment was 6.

Playtesting and refinements
To refine the design, we playtested the game in two phases. First, we recruited eight players through TurkerNation, a collective of crowdworkers on the Amazon Mechanical Turk platform. After providing informed consent, they played the game online, completed a post-game survey and were interviewed by the first author to obtain their feedback on ways to improve the game. Together, the game and interviews took about 30 minutes. Participants were paid $7 to complete both the game and the interview. In the second phase, we playtested the game directly on the MTurk platform. Eighteen workers completed the game and a post-game survey. We paid $3.75 for their participation. In both playtesting phases, we collected all inputs that players used in the game, including their initial and final guesses to questions, the clues they provided and their chat messages. Through playtesting, we refined the game in the following ways:
(1) Based on interview feedback that it took a few rounds initially for players to learn how to play the game, we added two practice trial rounds to the game. While functionally identical to the other game rounds, these rounds were not scored and included helpful tooltips and instructions on how to use the game interface. They were also helpful for players to get in sync with their partners.
(2) From the list of clues provided during the game, we inferred that some players did not use the scale to provide clues and instead used quantifier words such as lot and little and direction-related words like left/right and higher/lower to convey the correct percent. We added these words to our existing blocklist of clues.
(3) In an initial version of the game, we did not have any time limits to guess the correct percentage or to provide clues. However, to keep the game moving and to detect when a player left the game midway, we had to institute time limits. As discussed earlier, it was important to provide enough time for the players to think through their answers instead of responding on the fly. We settled on providing a minute for players to input their guesses and two minutes for them to provide clues. One concern with providing a lot of time is that while the clue-giver takes time to come up with a clue, the guesser will have to wait and might lose interest. However, through our interviews, we found that the waiting period increased anticipation and added to the excitement. As one participant put it, "it was like waiting to open a Christmas day present... If [the clue] took a while, it must be a doozy!" During the waiting period, we also included a nudge ("use this time to think of possible clues that [your partner] might come up with") to keep the guesser focused on the game.
(4) We also made other minor UI changes to the game such as updating the game instructions with clearer directions, providing information on the number of rounds completed and how many more rounds to go to finish the game, and adding a game summary page after completing the game containing all the game questions, answers and points scored.

Selecting game questions and scales
The game requires three main components: questions on party supporters' political views, nonpolitical questions and scales. We used crowdsourcing and publicly available surveys to select these components. We describe the process below.
3.4.1 Selecting questions on party supporters' political views. To obtain an initial set of questions on party supporters' political views, we used nationally representative survey data from the 2020 American National Election Studies (ANES) Time Series Study, the 2020 Cooperative Election Study (CES) and the 2021 General Social Survey (GSS). We manually selected all questions on political views from these sources and, using the survey data, obtained the percentage of Republicans and Democrats who held those views. Then, we used an Amazon Mechanical Turk (MTurk) survey to identify a subset of these questions to be used in the game. The selected questions were on policy issues, such as gun control and immigration, that Republican and Democrat MTurk workers considered most important but on which they showed the highest levels of misperception in the survey. The detailed procedure to select these questions (Section SM1.1) and the actual questions used (Tables SM1 and SM2) are provided in the Supplementary Materials.
3.4.2 Selecting nonpolitical questions. We selected nonpolitical questions from publicly available, nationally representative surveys conducted by YouGov and Ipsos and published on their websites. We manually identified questions from seven broad, largely nonpolitical categories: pets, relationships, supernatural, entertainment, hobbies, food and lifestyle. Using an MTurk survey, we then selected the subset of questions that MTurk workers expressed the most curiosity about. The detailed selection procedure and the actual questions used are provided in Section SM1.2 and Table SM3.

Selecting scales.
To select the scales to be used in the game, we constructed word pairs from online word lists and Wavelength game cards. We used another MTurk survey to select the scales that we used in the game. The details of the selection process and the actual scales used are provided in Section SM1.3 and Table SM4. In all three tasks above, we limited the MTurk participant pool to US-based MTurk workers who had at least a 98% task acceptance rate and had completed at least 1000 tasks.

Game Development
The game was developed using JavaScript and React, building on the codebase of an open-source version of the Wavelength game. The game was hosted on the Google Firebase platform: we used the Realtime and Firestore Databases to store game data and Cloud Tasks to manage matching users in real time. We used the StreamChat library to facilitate in-game chat.

Effects of playing the different game versions on affective polarization
The studies [3,13,35] previously discussed in Section 2.1 indicate that correcting misperceptions about the outparty is likely to reduce affective polarization. Political scientists have adopted multiple measures of partisan affect to quantify affective polarization, each providing a slightly different insight into the phenomenon [26]. One approach is to measure feelings towards outparty supporters using "feeling thermometers." Here, respondents are directly asked to rate Democrats and Republicans on a 101-point scale from 0 (cold) to 100 (warm). A higher rating indicates a warmer or more favorable feeling towards that group. An alternate approach is to quantify outparty social distance, which measures how comfortable individuals are with outparty supporters in different scenarios and social settings, such as having outparty supporters as close friends, neighbors and children's spouses. The feelings thermometer ratings measure attitudes towards the outparty in general, whereas the social distance measure captures attitudes towards specific circumstances. Though less broad than the feelings thermometer ratings, the social distance measure does not quantify intent towards any specific behavior or interaction with the outparty. Therefore, we additionally include a behavioral intent measure that quantifies participants' willingness to engage in political and nonpolitical conversations with outparty supporters on a 5-point scale. Together, these three measures provide a fairly comprehensive evaluation of the effect of playing the game on affective polarization. As detailed in Section 2.1, we expect that playing games that deliver misperception-correcting information will reduce affective polarization, observed as higher outparty warmth, lower social distance and higher willingness to engage with outparty supporters. We formulate the following hypotheses:
H1a: Players playing the mixed and fully political game will exhibit higher outparty warmth than those playing the control version.
H1b: Players playing the mixed and fully political game will exhibit lower social distance than those playing the control version.
H1c: Players playing the mixed and fully political game will exhibit higher willingness to talk to outparty supporters than those playing the control version.
Note that we do not have a prediction about whether the mixed or the fully political version might have larger treatment effects on the desired outcomes, and we do not test for such differences. Given the polarized current political climate and ordinary Americans' disdain for partisan politics [32], an overt attempt to correct perceptions about party supporters may result in psychological reactance as described earlier (Section 2.2), which may reduce the effectiveness of the intervention. At the same time, the mixed version of the game contains little corrective information. Participants could be distracted by other, more exciting aspects of the game, resulting in smaller treatment effects. From a practical standpoint, we powered our study to detect a difference in measures of outparty feelings between the control and treatment game versions. We do not expect the difference in treatment effects between the two treatment game versions to be large enough to detect. The primary purpose of this study is to compare both treatment game versions to the control version.

Comparing the favorability ratings of the different game versions
We also do not have a prediction about which game versions the players will like more. However, knowing which game version players like more can inform future iterations of the game. Therefore, we test for differences between the control version of the game and the two treatment versions on game favorability ratings.
RQ1: Are there differences between the game favorability ratings provided by players of the two treatment versions and those provided by players of the control version?

Underlying mechanisms
We examine three potential mechanisms that might mediate the reduction in affective polarization: perceived commonality, party stereotyping and psychological reactance.
Research suggests that individuals assume outparty supporters hold more extreme policy positions than they actually do. This results in what scholars term (mis)perceived polarization [36], which is even more strongly associated with negative evaluations of the outparty than is actual polarization [16]. By correcting misperceptions of outparty supporters' views, we expect players to recognize that the outparty is closer to their own views and shares more in common with them than they previously perceived. In a similar study, Levendusky and Stecula [43] also identified perceived commonality as a potential mediator in reducing affective polarization through cross-party dialogue. Thus, we expect playing the game to increase perceived commonality with the outparty, which in turn would reduce affective polarization.
H2a: Perceived commonality mediates the effect of playing the game on affective polarization.
An alternate potential pathway through which correcting misperceptions can reduce affective polarization is by reducing outparty stereotyping. In misperceiving that outparty members hold extreme issue positions, individuals also overestimate the extent to which outparty members are ideologically consistent. In one study, participants chose to ascribe remarkably consistent ideological positions ( > 0.7) across five issue domains (abortion, taxes, Obamacare, gun control, and immigration) to a social media user based on viewing only one of their Facebook posts [67]. Yet, as discussed earlier, only about 8% of partisans hold ideologically consistent positions (Appendix Section 6 in [56]). This stereotyped inference that outgroup members are "all the same" (called the out-group homogeneity effect) is closely associated with negative evaluations of the outgroup [57]. By playing the game, players likely come to realize that not all party supporters hold ideologically consistent positions on every issue, thereby reducing outparty stereotyping, which in turn would reduce affective polarization.
H2b: Outparty stereotyping mediates the effect of playing the game on affective polarization.
Finally, as discussed earlier, we examine psychological reactance as a potential mechanism that inhibits reducing affective polarization. The psychological reactance framework has been used to understand individuals' resistance to persuasive messages around health and science communication, such as promoting anti-smoking messages [17] and combating climate change denialism [44]. Given that partisan identity is an integral part of individuals' self-concept [75] and since exhibiting hostility towards the opposing party is a significant way for individuals to express their partisan identity [1], interventions that are perceived to curb this expression might induce psychological reactance. Players may feel that the game forces them to temper their opinions about outpartisans, and this perceived lack of freedom to think freely may result in the intervention backfiring.
H2c: Psychological reactance mediates the effect of playing the game on affective polarization.

Subgroups of interest
We analyze the effects of playing the game on four key subgroup classifications: party identification, party strength, size of misperception and political knowledge. This subgroup analysis is essential to understand which aspects of the game need improvement in future iterations. Given the significant differences between Republicans and Democrats, and especially considering that Republicans are becoming radicalized at a much faster rate [28], we examine whether the game has heterogeneous effects on the supporters of the two parties. Also, research suggests that strong partisans, as a consequence of having a more ingrained partisan identity and stronger motivated reasoning, would be less inclined to moderate feelings of outparty hostility than weak partisans [32]. Thus, we examine potential differential effects for strong and weak partisans. Past research also suggests that higher political knowledge is correlated with stronger affective polarization [69]. Therefore, we compare the effects of playing the game on the high and low political knowledge groups. Finally, given that the game aims to reduce misperceptions about party supporters' policy views, we examine whether it has differential effects on participants with high and low initial levels of misperception. Overall, we examine the following research question:
RQ2: Are there heterogeneous treatment effects of playing the game by party identification, strength of partisanship, political knowledge and size of misperceptions?

Experimental conditions
To evaluate the effect of playing the game on affective polarization, we performed a pre-registered between-subjects experiment on Amazon Mechanical Turk in which participants were assigned to one of three game versions. All three game versions had seven rounds: two trial rounds and five game rounds.
(1) The control version of the game supplied no corrective information and had seven rounds of nonpolitical questions.
(2) The mixed version of the game had two misperception-correcting questions (one Democrat-related and one Republican-related) in the second and fifth (final) game rounds and five nonpolitical questions in the other rounds.
(3) The fully political version had seven misperception-correcting questions (3-4 Republican-related and 3-4 Democrat-related), which were ordered at random.
The University IRB reviewed the study and determined that it is exempt based on federal exemptions 3(i)(A) and 3(i)(B).

Recruitment and experiment procedure
This experiment was performed in 27 batches (May 12, 2022 to May 31, 2022), as it required players to be present online at the same time. The median number of participants per batch was 21. In each batch, participants were randomly matched and assigned to play the control, mixed or fully political game version.
Approximately one hour before the start of each batch, we published a task (Human Intelligence Task, or HIT, in MTurk parlance) in which workers indicated whether they were available to play the game at a proposed time and whether they could use a laptop or desktop to play (we did not support playing the game on mobile devices). Participants were also informed that the game would close for new players within five minutes of the scheduled time. This was done to ensure that most participants would start the game simultaneously and be matched with another participant. In the scheduling task, we collected demographic details such as age and gender, party identification on a 7-point scale, and how often they played party games. From the 16th batch onwards, we included a simple captcha-type question to ensure that the players were real people (not bots) and could follow English instructions. Participants who satisfied the following conditions were invited to play the game: (i) they correctly completed the captcha question (if shown to them), (ii) they indicated being available at the proposed time and could use a laptop or desktop, and (iii) they were not political Independents. We limited this scheduling task to US-based MTurk workers who had at least a 98% HIT acceptance rate and had completed at least 1000 HITs. We also excluded workers who playtested the game or participated in any of the game content creation tasks described in Section 3.4, as well as workers who had completed the scheduling HIT in an earlier batch. All workers, regardless of whether they were invited to play the game, were paid $0.10 for completing this scheduling task.
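The screening rules above can be summarized as a pair of predicates. This is only a sketch of the stated criteria: the `Worker` fields, the function names, and the coding of the 7-point party scale (with 4 taken to mean Independent) are assumptions made for illustration, not the study's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Worker:
    # All field names are hypothetical; they mirror the criteria in the text.
    country: str
    approval_rate: float            # HIT acceptance rate in percent (0-100)
    hits_completed: int
    available: bool                 # free at the proposed time
    has_computer: bool              # laptop or desktop (mobile unsupported)
    party_id: int                   # 1..7; 4 is ASSUMED to mean Independent
    captcha_passed: Optional[bool] = None  # None if no captcha was shown
    prior_participant: bool = False  # playtester, content task, or earlier batch


def can_take_scheduling_task(w: Worker) -> bool:
    """Pool restrictions applied to the scheduling HIT itself."""
    return (
        w.country == "US"
        and w.approval_rate >= 98
        and w.hits_completed >= 1000
        and not w.prior_participant
    )


def is_invited(w: Worker) -> bool:
    """Conditions (i)-(iii) for being invited to play the game."""
    return (
        can_take_scheduling_task(w)
        and w.captcha_passed is not False   # passed, or never shown
        and w.available
        and w.has_computer
        and w.party_id != 4                 # exclude Independents
    )
```

Treating a missing captcha result as acceptable reflects the fact that the captcha was only introduced from the 16th batch onwards.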
Then, 10 minutes before launching the game task, we sent invited workers a notification through the MTurk platform reminding them of the game start time and providing instructions on finding the game on the platform. At the scheduled time, we launched the game and sent another reminder that the game had launched. The game closed for new participants 7 minutes after launch (2 minutes more than the 5 minutes stated in the scheduling instructions, to allow for stragglers). After providing informed consent, participants landed on the game home screen, where they chose an avatar and provided a game name. Then, participants were given a tutorial on playing the game, with multiple examples. From the 10th batch onwards, we changed the tutorial so that participants had to spend at least a minute on it before moving on to the matching screen. In the matching phase, participants were matched with another participant (if available) to form a team, and the team was randomly assigned to one of the three game versions. If participants were not matched with another person in three attempts, they were paid $0.50 as compensation for their time. Only 50 participants were not assigned a partner and had to leave.
Once matched, participants played the game. In the experiment, if a player did not provide a valid input for more than 90 seconds in either of the two guessing phases, or did not provide a valid clue or hit the pass button within 150 seconds in the clue-giving phase, we assumed that the player had left the game. In that case, we redirected their partner to the post-game survey to complete the HIT.
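The inactivity rule can be expressed as a small check. The sketch below assumes the two thresholds stated above and uses hypothetical phase labels ("guess" and "clue"); how the game actually tracks idle time is not described in the text.

```python
GUESS_TIMEOUT_S = 90    # limit in each of the two guessing phases
CLUE_TIMEOUT_S = 150    # limit in the clue-giving phase


def has_left_game(phase: str, seconds_without_valid_input: float) -> bool:
    """Return True once a player has been idle past the phase's limit.

    `phase` is a hypothetical label: "guess" or "clue".
    """
    if phase == "guess":
        return seconds_without_valid_input > GUESS_TIMEOUT_S
    if phase == "clue":
        return seconds_without_valid_input > CLUE_TIMEOUT_S
    raise ValueError(f"unknown phase: {phase!r}")
```

Both thresholds sit 30 seconds above the limits shown to players (one minute for guesses, two minutes for clues), which leaves a margin before a player is treated as having left.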
When players completed the game, they filled out a post-game survey to complete the HIT. We paid $3.75 to all participants who completed the HIT. In total, 777 participants completed the post-game survey. Of the 777, 103 participants completed the survey after their partner left the game mid-way. There was no major difference in dropoffs across the three conditions: 31 control, 36 mixed and 36 fully political version players dropped off the game mid-way. Among the remaining 674 participants, nine indicated in the post-game survey that they were political Independents and were removed from the analysis. In total, for our analysis, we have data from 665 participants: 224 control version players, 225 mixed version players and 216 fully political version players.
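The participant counts above can be reconciled with a few lines of arithmetic. The sketch assumes that the 103 participants whose partner left mid-way were set aside before the 674 were counted, which is how the reported totals add up; the variable names are illustrative.

```python
completed_survey = 777
midgame_dropoffs = {"control": 31, "mixed": 36, "fully_political": 36}
independents = 9
analyzed = {"control": 224, "mixed": 225, "fully_political": 216}

# Assumption: incomplete-game cases are excluded first, then Independents.
after_dropoffs = completed_survey - sum(midgame_dropoffs.values())
final_sample = after_dropoffs - independents

print(sum(midgame_dropoffs.values()), after_dropoffs, final_sample)
# prints "103 674 665"
```

The final figure matches the sum of the three analyzed groups (224 + 225 + 216 = 665).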

Measures
We outline all the survey measures in Table 1.
Outparty feelings. We use feelings thermometer measures to measure feelings towards outparty supporters on a 0-100 scale. Lower ratings represent colder/negative feelings toward the outparty supporters and higher ratings represent warmer/positive feelings. We collected feelings thermometer ratings towards Republicans (M = 45.05, SD = 24.60) and Democrats (M = 37.59, SD = 24.43), and used the participant's party affiliation to determine outparty feelings (overall, M = 39.72, SD = 24.70). An increase in feelings thermometer ratings would imply that individuals exhibited more warmth (or less negativity) towards outpartisans, indicating a reduction in affective polarization.
Social distance. To measure social distance, we used a standard 4-point scale measuring how comfortable/upset the participant would be with having an outparty supporter as a close friend, neighbor or relative (α = 0.83, M = 2.07, SD = 0.76). A decrease in social distance would imply that individuals became more comfortable (or less upset) with outpartisans in social settings, indicating a reduction in affective polarization.
Willingness to talk to outpartisans. We used two items to measure willingness to engage with outparty supporters on a 5-point scale. We asked how willing participants were to have political conversations with outparty supporters (M = 3.43, SD = 1.30) and how willing they were to have nonpolitical conversations with outparty supporters (M = 4.28, SD = 0.95). As Cronbach's α was relatively low (0.55) for these measures, we did not combine them. An increase in these measures would indicate a decrease in affective polarization.
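The decision not to combine the two willingness items rests on Cronbach's alpha. As a reminder of what that statistic computes, here is a minimal sketch using the textbook formula and sample variances; the function name and the data layout (one list of scores per item) are assumptions, and this is not the authors' analysis code.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for k items.

    `items` holds one sequence per item, each with one score per
    respondent (all sequences the same length, same respondent order).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Per-respondent total score across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))
```

Alpha approaches 1 when the items move together across respondents and falls as they diverge, which is why a value of 0.55 argues for analyzing the political and nonpolitical items separately.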

Mediators.
Perceived commonality. To gauge perceived commonality, we used Levendusky and Stecula's [43] two-item measure asking participants how much they agree with two statements on a 5-point scale: "There are many policy areas where Democrats and Republicans agree and can find common ground to work together." and "Democrats and Republicans agree on many more issues than the media says that they do." As the two items were highly correlated, we combined them by taking their mean (higher means more commonality, α = 0.80, M = 3.39, SD = 0.96).
Outparty stereotyping. To measure outparty stereotyping, we used a two-item measure asking participants how much they can tell about a person's political policy preferences by knowing that they are an outparty supporter and how much they can tell about a person's other values and goals by knowing that they are an outparty supporter. As the two items were highly correlated, we combined them by taking their mean (higher means more stereotyping, α = 0.80, M = 3.31, SD = 0.90).
Psychological reactance. We derived our psychological reactance measure from Moyer-Gusé et al.'s cognitive reactance scale [48] for measuring reactance to persuasive messages. Using a three-item measure, we asked participants how pressured, manipulated and forced they felt to form certain viewpoints about Republicans and Democrats. Since the three items were highly correlated, we combined them by taking their mean (higher means more reactance, α = 0.94, M = 2.18, SD = 1.20).

Moderators.
Party identification and partisan strength. We also collected participants' party identification (Strong Democrat: 38.34%, Weak Democrat: 16.54%, Lean Democrat: 15.94%, Lean Republican: 11.12%, Weak Republican: 6.01%, Strong Republican: 12.03%) in the post-game survey. For all our analyses, we used the post-game party identification. For 56 participants, because of a glitch in the post-game survey, we recorded their post-game party identification (Republican/Democrat) but not their party strength (Strong, Weak, Lean Republican/Democrat); we used the party strength that they provided in the pre-game scheduling survey instead. For the party-level moderation analysis, we binarized the party identification data into Democrats and Republicans. For the partisan strength moderation analysis, we classified strong Democrats and Republicans as strong partisans and other (Weak/Lean) Democrats and Republicans as weak partisans.
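The two recodings used for the moderation analyses can be sketched as a single mapping from the six party identification labels to a party and a strength group. The function name and the string labels are illustrative assumptions.

```python
def split_party_id(label: str) -> tuple[str, str]:
    """Map a label such as "Lean Republican" to ("Republican", "weak").

    Only "Strong" identifiers count as strong partisans; "Weak" and
    "Lean" identifiers are both grouped as weak partisans.
    """
    strength_word, party = label.split()
    if strength_word not in {"Strong", "Weak", "Lean"}:
        raise ValueError(f"unexpected strength: {strength_word!r}")
    if party not in {"Democrat", "Republican"}:
        raise ValueError(f"unexpected party: {party!r}")
    strength = "strong" if strength_word == "Strong" else "weak"
    return party, strength
```

For example, `split_party_id("Weak Democrat")` returns `("Democrat", "weak")`.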
Political knowledge. To gauge political knowledge, consistent with prior research [21], we used questions that evaluated participants' current political knowledge and, more generally, their understanding of the U.S. political system. We asked four factual multiple-choice political questions: Do you happen to know who the majority leader in the U.S. Senate is? Do you happen to know which political party has a majority in the U.S. House of Representatives? In the case of a tied vote in the U.S. Senate, who casts the deciding vote? What is the U.S. Electoral College? We summed the number of correct answers that each participant provided (M = 2.74, SD = 1.16). For the moderation analysis, we classified participants who correctly answered at least three of the four questions as high political knowledge participants (402 participants) and the rest as low political knowledge participants (263 participants).
In-game outparty misperception. For players of the treatment game versions, we measured outparty misperceptions as the difference between the answers that players provided during the initial guessing phase to questions about outparty supporters' views and the correct answers to those questions (survey estimates). We incorporate the direction of the misperception as follows: if the participant misperceives Republicans' views to be more conservative than their actual views, or misperceives Democrats' views to be more liberal than their actual views, we assign a positive sign to the misperception magnitude; otherwise, we assign a negative sign. Figure 3 shows the distribution of outparty misperceptions by party (Democrats: M = 37.64, SD = 23.03; Republicans: M = −9.93, SD = 26.36). We find that while Democrats (as expected) misperceive Republicans to be more conservative, Republicans, surprisingly, misperceive Democrats to be more conservative. We detail the potential reasons and consequences of this phenomenon in Section 5.6. As Figure 3 suggests, the size of misperception in our study is heavily correlated with party ID (r = 0.66, p < 0.01), so we did not perform the pre-registered moderation analysis based on outparty misperception.
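The signing rule for in-game misperceptions can be made concrete with a short sketch. It assumes each question is tagged with whether agreeing with its statement is a liberal or a conservative position; that tag (`statement_leaning`), the function name, and the argument names are hypothetical, since the text does not say how direction was coded per question.

```python
def signed_misperception(
    guess: float,
    truth: float,
    about_party: str,          # outparty the question asks about
    statement_leaning: str,    # hypothetical tag: "liberal" or "conservative"
) -> float:
    """Signed gap (in percentage points) between a guess and the survey value.

    Positive: Republicans seen as more conservative than they are, or
    Democrats seen as more liberal than they are. Negative otherwise.
    """
    if about_party not in {"Republican", "Democrat"}:
        raise ValueError(f"unexpected party: {about_party!r}")
    if statement_leaning not in {"liberal", "conservative"}:
        raise ValueError(f"unexpected leaning: {statement_leaning!r}")

    magnitude = abs(guess - truth)
    overestimates_support = guess > truth
    # Overestimating support for a conservative statement, or
    # underestimating support for a liberal one, reads as "more conservative".
    if statement_leaning == "conservative":
        seen_as_more_conservative = overestimates_support
    else:
        seen_as_more_conservative = not overestimates_support

    if about_party == "Republican":
        positive = seen_as_more_conservative
    else:
        positive = not seen_as_more_conservative
    return magnitude if positive else -magnitude
```

Under this coding, a Democrat who guesses that 30% of Republicans support a liberal policy when the survey value is 60% scores +30, while a Republican who guesses 40% of Democrats support it when the survey value is 70% scores −30, matching the direction of the party means reported above.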

Game-related measures.
Prior game experience. To measure players' game-playing experience, we asked "how often do you play party/card/board games?" on a 4-point scale (M = 3.25, SD = 0.70).

Manipulation and attention checks.
Political game check. We asked participants to rate, on a 5-point scale, how political they thought the game was. As expected, the control version was perceived as the least political (M = 1.67, SD = 1.18), followed by the mixed version (M = 2.40, SD = 1.03), followed by the fully political version (M = 4.53, SD = 1.08).
Attention check. We included two instructional manipulation checks [55] to test whether participants paid attention to the questions and followed the written instructions. 98% of participants passed both checks, and no participant failed both checks. No participant completed the survey in less than 45 seconds, which was the pre-registered threshold for removal from the analysis.
(Post-game) misperception correction check. To evaluate whether the participants actually registered the misperception correction information presented in the game, at the end of the post-game survey, we asked treatment game version participants two political questions that they had been asked during the game. To make answers comparable across the two conditions, we asked the fully political game players the questions they answered in the second and fifth rounds. Participants playing the control game were asked two political questions at random from our pool of political questions (one Democrat-related and one Republican-related question). If our game was effective in correcting misperceptions and participants could recall the corrections, we would find that the players in the mixed and fully political conditions, on average, provided answers closer to the correct answer than players of the control version of the game. Indeed, we found that mixed (M = 15.88, SD = 14.86) and fully political version (M = 21.87, SD = 16.85) players on average supplied answers with lower levels of misperception than control version players (M = 30.30, SD = 14.99). Interestingly, we found that the mixed version players were significantly more accurate than fully political version players (test statistic = 2.14, p < 0.05), perhaps because political questions in the mixed version were rare and salient, thereby improving recall. Note that this measure determines the post-correction misperception size, whereas the measure in Section 5.3.3 measures the pre-correction misperception size.

Pre-registered analysis plan and deviations
We ran an OLS regression with random effects for teams and experiment batches to estimate the main effects of playing the treatment games on outparty feelings. We control for relevant sociopolitical variables such as age, race, gender and party identity, and for game-related variables such as past gaming experience and the ratings that players gave to the game, to their partner and to their own play. To estimate the main effects on social distance, we performed a similar regression analysis with the same controls and the social distance measure as the dependent variable. To estimate the main effects on willingness to have political conversations with outpartisans and willingness to have nonpolitical conversations with outpartisans, we ran two separate ordinal regression analyses using the same control variables and random effects as above. As per our pre-registration plan, we did not combine the two measures, as Cronbach's α was 0.55. Finally, we ran an ordinal regression analysis to compare the game favorability ratings across conditions using the same control variables and random effects as above. We used the lme4 package [5] to run the random effects OLS models and a separate package [9] to run the ordinal regressions.
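One way to write down the outparty feelings model described above is as a linear mixed model. The notation below is our own shorthand, not taken from the pre-registration: i indexes participants, t teams and b batches; the treatment indicators, the control vector x_i, and the normality of the random intercepts (the usual assumption in such models) are all stated here as assumptions.

```latex
% Sketch of the random-intercept specification (notation is assumed).
\begin{aligned}
y_{itb} &= \beta_0
  + \beta_1\,\mathrm{Mixed}_{t}
  + \beta_2\,\mathrm{Political}_{t}
  + \boldsymbol{\gamma}^{\top}\mathbf{x}_{i}
  + u_{t} + v_{b} + \varepsilon_{itb},\\
u_{t} &\sim \mathcal{N}(0,\sigma_u^{2}),\qquad
v_{b} \sim \mathcal{N}(0,\sigma_v^{2}),\qquad
\varepsilon_{itb} \sim \mathcal{N}(0,\sigma^{2}).
\end{aligned}
```

Here β1 and β2 are the contrasts of the mixed and fully political versions against the control version, and u_t and v_b are the team and batch random intercepts.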
We deviated from our pre-registered plan in a few ways. First, we planned to control for the experiment batch by adding a fixed effect. However, since there were 27 batches and each batch had 10-46 participants, we decided to control for the experiment batch as a random effect. The random effect allows for partial pooling of individual batch effects, reducing overfitting. Second, we planned to control for educational attainment, but because of a coding error, the measure was not collected and is not included in the analysis. Similarly, because of a coding error, we did not collect the outparty stereotyping, social distance and willingness to engage with outpartisans measures for the first 56 participants. We removed those participants from any analysis of the aforementioned measures.
To examine whether perceived commonality, outparty stereotyping and psychological reactance mediate these outcomes, we ran mediation models with controls for demographics and game-related variables using the PROCESS package in R [23]. We also examined how the main effects vary by party identification, party strength and political knowledge. To evaluate moderation effects, we used a single random-effects OLS regression modeling outparty feelings with controls for demographics and game-related variables and an interaction term between the treatment condition and each moderator variable. Although we pre-registered an examination of treatment effects on participants with low and high outparty misperception, we did not perform the analysis, as the size of misperception in our study is heavily correlated with party ID (r = 0.66, p < 0.01), as observed in Figure 3. We use an R package [37] to estimate the contrasts between playing the treatment game versions and the control version for the subgroups. We note that we did not pre-register the moderation analysis by political knowledge; however, there is strong evidence to suggest that individuals who have higher political knowledge have more polarized attitudes [31,69]. Further, unlike other interventions, we expect the game to be played by people who are not necessarily politically engaged and knowledgeable, so understanding how political knowledge might moderate the main effects is important for this study. However, we note that the experimental setup is powered to detect main effects only and that the other analyses are exploratory.

Main effects of playing the mixed and political versions of GuesSync!
We evaluate the H1 set of hypotheses, which predicted a positive effect of playing the treatment versions of the game on the affective polarization outcome measures of outparty feelings, social distance and willingness to engage in political and nonpolitical talk with outparty supporters.
First, we examine the effect of the treatment games on outparty feelings (H1a). The left column in Table 2 shows the coefficients from the pre-registered OLS regression model predicting outparty feelings, controlling for demographic variables and game experience. We do not find reliable evidence of an increase in outparty warmth when playing either treatment version versus playing the control version. Therefore, H1a is not supported.
Next, we examine the effect of the treatment games on social distance (H1b). The right column of Table 2 shows the coefficients of the pre-registered ordinal regression model predicting social distance, controlling for the aforementioned variables. Similar to the outparty feelings ratings, we find no reliable evidence of a reduction in social distance when playing the treatment versions of the game compared to the control version. Therefore, H1b is not supported. Interestingly, from the pre-registered models, we find surprisingly consistent evidence that Republicans in our sample are less affectively polarized than Democrats, exhibiting about 8.36 degrees more warmth towards Democrats and about 0.38 points less social distance from Democrats than vice versa. We delve into potential reasons for this in Section 5.6.
We then examine the effect of the treatment games on willingness to engage in political and nonpolitical talk (H1c). We report the coefficients of the pre-registered ordinal regression models with game and experiment batch random effects, controlling for demographic and game-related variables, in the left and center columns of Table 3. We find that participants playing the mixed version of the game exhibited 43% higher odds of willingness to engage in political discussions than players of the control version (β = 0.359, OR = 1.43, p = 0.027 using a one-tailed test as per pre-registration). Playing the fully political game did not result in a statistically significant increase in willingness to talk politics with outparty supporters, but its effects were directionally similar to those of the mixed game version (β = 0.301, OR = 1.35, p = 0.051 using a one-tailed test as per pre-registration). However, neither the mixed nor the fully political version resulted in a reliable increase in willingness to have nonpolitical conversations with outparty supporters, perhaps partly due to ceiling effects (M = 4.28 out of 5, Section 5.3.1), meaning people are already quite open to engaging in nonpolitical topics with outparty supporters. Given that we find support that playing the treatment versions of the game improves willingness to talk with outparty members on political issues but not on nonpolitical issues, H1c is partially supported.
Finally, we examine the effect of introducing political content in the treatment games on game favorability ratings (RQ1). The right column in Table 3 shows the coefficients of an ordinal regression modeling game ratings with demographic controls. We find no significant difference between the ratings given to the mixed and fully political game versions and those given to the control version. Given the reasonable number of observations, the lack of evidence of a main effect is unlikely to be due to small sample sizes. It appears that adding political questions to the game does not significantly change how people rate the game. Further supporting this conclusion, we also include additional analyses that we performed to understand how players engaged with the game. Table 4 shows, by game version, the means and standard deviations of the key game perception metrics we collected. Consistent with our analysis of game ratings, we find that on all other measured metrics (likelihood of playing again, likelihood of recommending the game to friends, and how fun, informative and surprising the game was), the ratings for the treatment versions were comparable to those for the nonpolitical control version of the game.
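The odds figures reported for willingness to talk politics follow from exponentiating the ordinal regression coefficients. The short check below reproduces that arithmetic; the function names are ours, and the only inputs are the two coefficients quoted above (0.359 for the mixed version and 0.301 for the fully political version).

```python
import math


def odds_ratio(log_odds_coefficient: float) -> float:
    """Convert a log-odds regression coefficient to an odds ratio."""
    return math.exp(log_odds_coefficient)


def percent_change_in_odds(log_odds_coefficient: float) -> float:
    """Percentage change in odds implied by the coefficient."""
    return (odds_ratio(log_odds_coefficient) - 1) * 100


mixed = odds_ratio(0.359)
fully_political = odds_ratio(0.301)
print(f"{mixed:.2f} {fully_political:.2f}")  # prints "1.43 1.35"
```

An odds ratio of 1.43 corresponds to the "43% higher odds" wording used for the mixed version.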

Mediation analyses.
We analyze the indirect effect of playing the treatment game versions on the outcome measures through perceived commonality, outparty stereotyping and psychological reactance (H2 set of hypotheses). Figure 5.5.2 shows the parallel multiple mediation model with the regression coefficients. Table SM11 shows the direct and indirect effects on the outcomes as well as the standard errors and 95% percentile confidence intervals, which are calculated from 5,000 bootstrap samples.
First, we examine the mediating effect of perceived commonality on the affective polarization measures (hypothesis H2a). From Table SM11, because the 95% confidence intervals include zero for all outcomes, we cannot conclude that the indirect effect of playing the treatment games on the affective polarization measures through perceived commonality differs from zero. Therefore, the mediating effects of perceived commonality are not significant and thus, H2a is not supported.
Note (mediation model figure): The mixed game denotes the difference between the mixed game and control condition. The fully political game denotes the difference between the fully political game and control condition. All numbers are regression coefficients. Solid lines denote statistically significant relationships (* indicates p < 0.05, ** indicates p < 0.01); gray dotted lines denote non-significant relationships. We do not include the direct effects in this figure to reduce clutter, but the direct effects are available in Table SM11.
Next, we examine the mediating effect of outparty stereotyping on the affective polarization measures (hypothesis H2b). From Table SM11, similarly, as the 95% confidence intervals include zero for all outcomes, the mediating effects of outparty stereotyping are not significant and thus, H2b is not supported.
Finally, we examine the mediating effect of psychological reactance on the affective polarization measures (hypothesis H2c). From Table SM11, we find that the indirect effects on outparty feelings through psychological reactance from playing the mixed game version (effect = 2.435, SE = 0.715, 95% CI [1.163, 3.942], standardized effect = 0.098) as well as from playing the fully political game version (effect = 3.493, SE = 0.788, 95% CI [2.068, 5.195], standardized effect = 0.140) were statistically significant. Similarly, an indirect effect through psychological reactance was observed on social distance from playing the mixed (effect = 0.027, SE = 0.013, 95% CI [0.004, 0.057], standardized effect = 0.036) and fully political game (effect = 0.039, SE = 0.018, 95% CI [0.007, 0.077], standardized effect = 0.051) versions. An indirect effect was also observed on willingness to engage in outparty nonpolitical talk when playing both the mixed (effect = −0.077, SE = 0.025, 95% CI [−0.129, −0.033], standardized effect = −0.0804) and the fully political game versions (effect = −0.110, SE = 0.030, 95% CI [−0.174, −0.057], standardized effect = −0.1154). Based on the signs of the indirect effects, we observe that playing the treatment versions of the game induces psychological reactance which, in turn, increases social distance from the outparty and decreases willingness to engage in nonpolitical talk with outparty members, but also increases outparty warmth. We speculate on reasons for this unexpected observation in the discussion section. The mediating effect of psychological reactance on willingness to engage in outparty political talk was not significant. Given that we observe small but significant mediating effects of psychological reactance on 3 out of 4 of our affective polarization measures, we conclude that hypothesis H2c is partially supported.
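
The decision rule used throughout this subsection (an indirect effect is treated as significant when its 95% percentile bootstrap confidence interval excludes zero) can be illustrated with a minimal sketch. This is not the study's analysis code: it uses a single mediator rather than the parallel multiple mediation model, omits the control variables, and runs on synthetic data. All names (`indirect_effect`, `percentile_bootstrap_ci`) and the simulated effect sizes are assumptions made for illustration.

```python
import random
import statistics


def indirect_effect(rows):
    """Return a*b for rows of (x, m, y) with a binary treatment x.

    a: difference in mediator means between treated and control rows.
    b: slope of the outcome on the mediator holding treatment fixed,
       obtained by centering m and y within each treatment group.
    """
    m_groups = {0: [], 1: []}
    y_groups = {0: [], 1: []}
    for x, m, y in rows:
        m_groups[x].append(m)
        y_groups[x].append(y)
    m_mean = {g: statistics.fmean(v) for g, v in m_groups.items()}
    y_mean = {g: statistics.fmean(v) for g, v in y_groups.items()}
    a = m_mean[1] - m_mean[0]
    s_my = sum((m - m_mean[x]) * (y - y_mean[x]) for x, m, y in rows)
    s_mm = sum((m - m_mean[x]) ** 2 for x, m, y in rows)
    return a * (s_my / s_mm)


def percentile_bootstrap_ci(rows, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap CI: resample rows with replacement, re-estimate
    the indirect effect, and read off the empirical quantiles."""
    rng = random.Random(seed)
    estimates = sorted(
        indirect_effect(rng.choices(rows, k=len(rows))) for _ in range(n_boot)
    )
    cut = int(((1.0 - level) / 2.0) * n_boot)
    return estimates[cut], estimates[n_boot - 1 - cut]


# Synthetic data (assumed values): treatment raises the mediator by 1.0,
# and each unit of the mediator raises the outcome by 2.0.
data_rng = random.Random(42)
rows = []
for i in range(200):
    x = i % 2
    m = 1.0 * x + data_rng.gauss(0.0, 1.0)
    y = 2.0 * m + 0.5 * x + data_rng.gauss(0.0, 1.0)
    rows.append((x, m, y))

estimate = indirect_effect(rows)
ci_low, ci_high = percentile_bootstrap_ci(rows, n_boot=1000)
significant = ci_low > 0 or ci_high < 0  # interval excludes zero
print(f"indirect effect = {estimate:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

The demo uses 1,000 resamples to keep the run short; passing `n_boot=5000` matches the number of bootstrap samples reported above.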

Moderator analyses.
We analyze how party identification, party strength, and political knowledge moderate outparty feelings (RQ2). We report on results from the pre-registered OLS regression model controlling for demographic and game-related variables in Figure 5.²¹ We provide the complete set of coefficients from the OLS regression in Table SM10. Since multiple interaction terms are hard to interpret using a regression table, we plot each moderator's mean treatment effect (in feelings thermometer degrees) and confidence intervals of the treatment game versions in Figure 5.
Analyzing moderation by party identification, we find that Democrats playing the mixed and fully political versions exhibited outparty feelings that were, on average, 6.58 degrees (p < 0.01) and 5.26 degrees (p < 0.05) warmer, respectively, than Democrats playing the control version. Republicans playing either treatment version did not reliably exhibit changes to outparty feelings compared to the control version. We examine potential reasons for the heterogeneous effects on Republicans and Democrats in the following section. Comparing the effects of playing the treatment games on strong and weak partisans, none of the differences were statistically significant. Likewise, comparing the effects on players with low and high political knowledge, none of the differences were statistically significant.

Why do Republican participants express significantly warmer outparty feelings and lower social distance than Democrats?
We compare the study sample demographics with data from the nationally representative ANES survey. We find that 41.57% of sample Republicans indicate that they "Lean Republican" compared to 24.95% of Republicans nationally, as estimated from the ANES survey. In comparison, 23.57% of sample Democrats indicate that they "Lean Democrat" compared to an estimated 24.76% of Democrats from the ANES survey. Thus, Republicans in the study sample are more moderate than the typical Republican in the broader electorate. Also, in the study sample, the average Republican is more moderate than the average Democrat. As weak partisans typically exhibit less outparty hostility, this could be one reason Republicans in our sample, on average, exhibit warmer feelings towards Democrats than vice-versa.

Why does the game have differential effects on Republicans and Democrats?
Given that correcting misperceptions is the primary way through which we reduce affective polarization, we analyze participants' own-party and outparty misperceptions based on the initial in-game guesses (in Section 5.3.3) to identify why the game is not effective on Republicans. We summarize how Republican and Democratic participants answered in-game questions on party supporters' views in Table 5. We find that Democratic participants overestimated how conservative Republicans were in about 92% of their answers about Republicans. In contrast, they underestimated how liberal Democrats were in about 57% of their answers about Democrats. On the other hand, Republican participants overestimated how conservative Republicans were in about 92% of their answers about Republicans. At the same time, they underestimated how liberal Democrats were in about 64% of their answers about Democrats. Thus, in ample cases (92%), the game could correct Democrats' misperception that Republicans were extreme conservatives, whereas only in a minority of cases (36%) could the game correct Republicans' misperception that Democrats were extreme liberals. This difference between the game experiences of Republicans and Democrats could have contributed to the differential effects. The natural question that follows is why Republicans exhibit fewer misperceptions than Democrats in this study, given that prior studies do not find such differences.

Why do Republicans exhibit fewer misperceptions about the extent to which Democrats are liberal?
We believe that a major reason for Republicans exhibiting fewer misperceptions was the set of game questions about Democrats that we had selected. Consider the question, "what percentage of Democrats say that transgender people face no discrimination at all in the US?" (from Table SM2). The survey estimate was 1%. For this question, because of the extremity of the survey estimate, only a guess of 0% would imply that the player thought Democrats were more liberal than they actually are, while any guess above 1% would imply that they thought Democrats were more conservative than they actually are. We found four similar Democrat-related questions. For these questions, since the survey estimates indicate that almost all Democrats hold the most liberal positions conceivable, participants cannot misperceive Democrats to be even more liberal. These questions skewed our misperception estimates of Democrats' views, which likely reduced the game's effectiveness on Republicans. We did not find similar issues for Republican-related questions.²² In the question selection process detailed in Section SM1.1, we only selected questions on which participants had the largest misperceptions but did not consider the direction of the misperception. In hindsight, for this particular intervention, we ought to have selected questions for which the survey estimates were not extreme values and questions for which Democrats' views were misperceived to be more liberal and Republicans' views were misperceived to be more conservative.
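
The asymmetry created by an extreme survey estimate can be made concrete by counting how many possible guesses fall on each side of it. The sketch below is illustrative only; it assumes guesses are whole percentages from 0 to 100 (the game's actual input granularity is not stated here) and that, for this question, a higher percentage corresponds to a more conservative perception. The function names are hypothetical.

```python
def misperception_direction(guess, estimate, higher_is_conservative=True):
    """Classify a guess relative to the survey estimate for one question."""
    if guess == estimate:
        return "accurate"
    guessed_higher = guess > estimate
    if guessed_higher == higher_is_conservative:
        return "more conservative"
    return "more liberal"


def count_guesses_by_direction(estimate, higher_is_conservative=True):
    """Count how many whole-percent guesses (0..100) fall in each category."""
    counts = {"more liberal": 0, "accurate": 0, "more conservative": 0}
    for guess in range(0, 101):
        direction = misperception_direction(guess, estimate, higher_is_conservative)
        counts[direction] += 1
    return counts


# Transgender-discrimination question about Democrats: survey estimate of 1%.
counts = count_guesses_by_direction(estimate=1)
print(counts)
# → {'more liberal': 1, 'accurate': 1, 'more conservative': 99}
```

With an estimate of 1%, only one of the 101 possible guesses can register as perceiving Democrats to be more liberal than they are, so such a question can almost never correct that particular misperception.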

DISCUSSION
Engaging in politics through games
Though we did not observe a main effect on outparty feelings, the moderation analyses suggest that the games might be particularly effective among Democrats (RQ2). Playing the mixed version of the game increased the willingness to engage in political discussions with the outparty (H1c). These findings take on greater importance as prior research suggests that partisans exhibit a strong reluctance to engage with the other side, even on nonpolitical topics [68]. Further, these results suggest that this game could be used as a potential ice-breaker activity in local community meetings, participatory planning meetings and citizen forums before participants engage with opposing partisans on substantive issues. Notably, based on our game experience measures (RQ1), participants appear to enjoy playing the treatment game versions at least as much as the nonpolitical control version. This suggests that corrective political information can be incorporated within game settings without negatively impacting the game's fun quotient. Such games can be scaled up to a broader audience by embedding them on social media platforms such as Facebook. Further, these games could complement (or be a precursor to) other interventions that require a deeper engagement with outparty individuals, such as having one-on-one [10] or group discussions [43]. As more people show little appetite for politics [34], these games could provide a small dose of politically relevant information packaged in a casual, fun way.
One concern with presenting important political information through a fun casual game is that it could desensitize players to, and trivialize, serious political issues [64]. Yet, in some ways, a lighter engagement with politics, the kind that the game promotes, might actually benefit most people. Krupnikov and Barry [34] identify the "other divide" in the US based on political engagement, between a small minority of citizens who are "deeply involved"²³ and all others (who are simply in the know about politics or do not follow politics entirely). Most people encounter political interactions casually, at workplaces, social gatherings and on social media. These encounters are likely with the deeply involved partisans who are the most vocal. While these encounters provide a conduit for (biased but nonetheless) political information, they also elicit negative internal comparisons with the deeply involved, even resulting in disengagement from politics altogether. Instead, games such as GuesSync!, with their lighter engagement with politics, could build curiosity and create a positive association with politics, which may increase political participation. Indeed, Lerner, in his book [38] on making democracy fun, makes a convincing case for how games and game-like processes, when designed carefully, can increase involvement in the democratic process by making public hearings and community meetings more fun and engaging.
²² [...] that the police officers never use more force than necessary?". The survey estimate was 3%. Here, any guess from 3%-100% would imply that the player thought Republicans were more conservative than they actually were. There were three such questions.
²³ The deeply involved are people who (i) spend much time on politics at the cost of other activities, (ii) perceive even mundane political events as significantly important and (iii) are extremely vocal about their political thoughts and opinions. These people also harbor high levels of animosity towards outpartisans.

Role of psychological reactance in attempts to reduce affective polarization
Through the mediation analysis, we find that playing the treatment versions of the game resulted in higher ratings on the psychological reactance scale. Note that the psychological reactance scale measures feelings of being pressured/manipulated/forced to form certain views about Republicans and Democrats. Thus, feelings that could potentially result in psychological reactance were created by playing the game. However, we found its effects on the outcome measures were mixed, reducing the willingness to talk politics with outpartisans and increasing social distance, but also increasing outparty warmth (H2c). It is unclear why there are opposite effects for the different outcomes. One potential reason could be that the feelings thermometer ratings measure a somewhat abstract concept of feelings towards the outparty, whereas social distance and willingness to talk politics measure attitudes toward specific scenarios and behaviors. Thus, the feelings of being pressured/manipulated/forced do not translate into psychological reactance when participants are asked about abstract attitudes, but they likely do when participants are asked about engaging with an outpartisan, which is perhaps a bridge too far.
Psychological reactance has not been previously tested as a potential mechanism in the context of affective polarization. However, it could be a possible explanation for why some efforts to reduce affective polarization have often yielded relatively modest effects [41,76,77]. In one study, Levendusky [41] tested if inducing partisan ambivalence, by asking people what they dislike about their own party and like about the other party, could reduce affective polarization. Many participants resisted the task with responses such as 'nothing' and 'are you kidding me?'. While psychological reactance was not formally measured, the responses suggest that it could have been induced, as this was a somewhat direct manipulation. The fact that even the mixed version of the game, which contains little political information, triggered a measurable increase in psychological reactance suggests that other approaches could also trigger the same. More research is needed to better understand when psychological reactance is triggered and ways to mitigate it.

Heterogeneous effects of correcting misperceptions about party supporters' political views
Moderation analysis (RQ2, Figure 5) suggests that Democrats playing the treatment games generally exhibited more warmth towards Republican supporters, whereas Republicans playing the treatment games did not reliably exhibit a change in their feelings towards Democrats compared to those playing the control version. Exploratory analyses in Section 5.6 suggest that a major reason for this might be that many of our Democrat-related game questions did not result in correcting misperceptions of Democrats being extremely liberal. Instead, the survey estimates for those questions only reaffirmed that Democrats were extremely liberal in their views. In hindsight, we ought to have selected questions for which the survey estimates were not extreme values. Note that in Section SM1.1, we selected political game questions based on the size of misperception and importance rating. However, we did not factor in the direction of the misperception. For this game, we ought to have considered the direction of misperception and selected questions for which Republicans overestimate how liberal Democrats' views are. However, this surfaces an important conundrum. Do we correct misperceptions about the outparty only on certain views where we know that outparty extremity is exaggerated? By focusing only on issues that are misperceived to be contentious, we might reduce affective polarization. However, we run the risk of players perceiving more common ground than there is, which may dampen political mobilization efforts [22]. Moreover, in the long term, if the game is perceived to only correct certain kinds of misperceptions, players might consider it to be overly manipulative and not return to play again, or the game's effectiveness in reducing outparty hostility might be reduced.

LIMITATIONS AND FUTURE WORK
We acknowledge that our study has some limitations in the game's design and the experiment. As discussed earlier, the game questions suppressed misperceptions of Democrats being more liberal, which likely reduced the effectiveness of the game intervention on Republicans. Also, the game always corrects perceptions about both Republican and Democratic supporters' political views, which does not allow us to distinguish between effects due to corrections about inparty and outparty political views. We do not measure how confident players are about their perceptions of party supporters' views during the game. In our question selection process, we selected only the questions on topics that partisans claimed were most important to them, so participants were likely, on average, more misinformed than uninformed about these topics. Nevertheless, we cannot distinguish between the uninformed and the misinformed in this game design. Finally, it is possible that the original misperceptions that participants had about the opposing party supporters continue to shape their attitudes about them even after the in-game correction. This phenomenon, called belief echoes, is observed in misperception corrections of factual news [72]. More research is needed to see if such belief echoes also exist when correcting perceptions of others' views.
Our experiment participants were recruited from the Amazon MTurk platform and appear to be disproportionately young, male and white compared to census statistics (we discuss our participant sample further in Section SM1.4). Thus, it is unclear how the results might differ when the larger public plays the game. Also, the post-game survey was administered immediately after the game, so we do not know how long treatment effects might last. However, participants signaled that, if given the opportunity, they would play the game again (Table 4) with different questions. If the game were played repeatedly, these effects might hold long-term. Another related limitation would be the number of unique political questions available in the game if the game were to be played long-term. Naturally, we are limited to the topics on which partisans misperceive party supporters' views. Unfortunately (or fortunately for the game), because of factors such as partisan media, selective media exposure and motivated reasoning, these misperceptions are likely to remain, if not grow, in the foreseeable future. Thus, we do not expect to run out of questions to ask in the game. In this work, because of resource constraints, we used existing survey data to formulate questions, which limited the questions we could ask. If we had more resources, we could run our own nationally representative surveys to create material for the game.
In terms of concrete next steps for GuesSync!, given that the selected questions about Democrats' policy views did not correct misperceptions, we aim to generate alternate questions for Republicans about Democrats' policy views. Then, we aim to scale up the game and make it available to the general public through social media platforms. More broadly, results from the experiment suggest that people enjoy playing fun and engaging games that may be political. This presents more opportunities to mix politics and play. In GuesSync!, we designed the game to reduce misperceptions about party supporters, an approach known to reduce affective polarization. This is just one of many viable depolarization strategies, such as priming a superordinate identity [40], that could be incorporated within game contexts. In the future, we hope to explore alternate strategies, game mechanics and storylines that can reduce hostile attitudes and behavior.

CONCLUSION
In this work, we present a fun and engaging casual game, GuesSync!, which we designed to help reduce affective polarization and increase engagement with outparty supporters. From experimenting with three game versions, we did not find evidence that GuesSync! reliably reduces affective polarization. However, the treatment versions of the game were effective in improving outparty feelings among Democrats. The mixed version was also effective in improving willingness to talk politics with outpartisans. We also identified psychological reactance as a potential mechanism that might affect the effectiveness of depolarization interventions. Finally, our game experience measures show that the two political games were just as fun to play as the nonpolitical game version, suggesting that, contrary to popular belief, people do, in fact, like to mix politics and play.

Table 1. Outline of measures collected during the experiment

Table 2. OLS regression coefficients modeling outparty feelings and social distance

Table 3. Ordinal regression coefficients modeling willingness to talk to outparty and game ratings

Table 4. Game experience measures by game type

Table 5. Misperceptions gauged based on guesses in initial guess phase