CONTENTR: An Experiential Game for Teaching Value Tradeoffs in Social Media Governance

Online content moderation has become the subject of intense debate as policymakers and platform developers aim to balance values such as freedom of expression and community safety. Despite the impact of content moderation on public discourse and online experiences, the debate surrounding content moderation regulation rarely involves those impacted by these decisions. To explore how to engage individuals in learning opportunities that deepen understanding of social and technical aspects of online content moderation, we designed and tested an educational game: CONTENTR. The game gives participants experience debating and making decisions about platform governance. We used the Values at Play (VAP) game design framework to discover and translate social values into game elements, and then verified both values translation and learning outcomes using qualitative feedback from three phases of testing. We found that gameplay facilitated collaborative discussion and decision-making regarding the challenges of designing an online platform for mass appeal. Both tabletop and online versions of the game are available on our project website. Our findings highlight how gameplay can create a deeper understanding of the challenges involved in developing and enforcing online content policies, and challenge participants' pre-existing values and assumptions through both game elements and exposure to other participants' perspectives. We believe the game will be useful in courses ranging from civics and technology policy to information and computer science.


INTRODUCTION: MOTIVATION & GAME OBJECTIVES
The proliferation of online platforms in which any user can be an actor, journalist, or opinion writer has irreversibly changed the information environment and made way for the practice of content moderation. In order to engage students and the public in the challenging tradeoffs that underpin platform governance, we designed CONTENTR. This paper outlines our design and evaluation process, informed by Flanagan and Nissenbaum's [22] Values at Play (VAP) framework. The educational outcomes and goals were informed by the "discovery" stage of VAP. Content moderation is the set of policies and practices online companies establish and enforce regarding what content users can post [28,45]. In recent years, social media content moderation has become the subject of intense policy debate as the public has lost confidence in platforms' ability to keep online communities safe [8,36,37]. Across the political spectrum, legislators and advocates agree that new regulation is needed, but there remains little consensus on the goals or approach [32]. In the United States (US), much of the debate centers on Section 230 of the Communications Decency Act ("CDA 230"), which shields an interactive computer service from liability for content on its platform and for its choice to "restrict access to or availability of material that the provider or user considers to be obscene" [1]. The range of diverse views on what constitutes obscene or harmful content, and who should decide, makes the 27-year-old law difficult to reform [24,34].
Despite the impact of content moderation on public discourse [11,54], public engagement in the debate surrounding content moderation regulation is understudied. Scholars have analyzed social media users' views of company policies [13,46] and the experiences of social media users who have had their content flagged or removed or had their account suspended [31,40,50,52]. Notably, Fan and Zhang [19] convened citizen juries where participants discussed borderline cases of offensive content, finding that collaborative deliberation encouraged authentic conversation.
To explore how we might effectively create learning opportunities to improve student and public understanding of this technology policy issue, we designed and tested CONTENTR, a game in which participants debate and make decisions about platform governance in a US context. Prior work has explored the potential for games to serve as a form of policy education through what are known as "serious games," a category of games that have an objective beyond entertainment and during which players may learn to plan or navigate tradeoffs [5,18].
Grappling with complexity and value tradeoffs is a core tenet of policy development, and researchers have therefore explored serious games to teach policy [10,14,27,42,51]. Specifically, simulation-based policy games that include realistic roles and examples increase engagement with learning [6,20,42,51]. Compared to a casual political conversation, roles within a game increase the focus on a specific set of values, making it easier for players to articulate opinions [10]. Roles can also be used to challenge a player's assumptions by providing an opportunity to explore perspectives other than their own [21,30] and to expand ethical reasoning [47,48]. Lastly, when players are able to deliberate within a game, it improves their ability to construct a thoughtful argument [35]. Within computer science and information science education, serious games are increasingly used to teach computing ethics [9,47] and introduce policy issues in computing [41].

METHOD: VALUES AT PLAY FOR GAME DEVELOPMENT & EVALUATION
The Values at Play (VAP) framework encourages game designers to make intentional choices regarding the values within their game [21,22]. The framework includes three steps completed in an iterative cycle: discovery, understanding values relevant to a gaming experience; translation, realizing values through game elements such as roles, narrative, and layout; and verification, establishing the validity of efforts to discover and translate values [22]. After discovery and translation, we conducted verification through three phases of play sessions analyzed using qualitative methods. Specifically, phase 1 analyzed observations of 4 play sessions (8 players), phase 2 analyzed post-game assessments from 9 play sessions (31 players), and phase 3 analyzed audio recordings from 2 play sessions (6 players).

Discovery & Translation
Flanagan and Nissenbaum [22] encourage game designers to consider a range of societal inputs when discovering values relevant to the gaming experience and goals. Our literature review of social media regulation policy proposals inspired three values we aspired to embed into CONTENTR. First, we sought to educate players about values trade-offs: the challenge of balancing free expression (FE) and community safety (CS). Next, we hoped to build empathy for individuals working in content moderation by emulating moderator roles. And finally, we hoped to inspire collaboration between players. We describe below how we translated these value-oriented goals into game elements including the narrative, roles, decisions, rules for interaction, rewards, and aesthetics [22].
2.1.1 Values Trade-offs. A platform's moderation policies represent a set of compromises required to create a site that appeals to a wide range of users [15,23,34]. In the case of CDA 230 reform, the challenge of balancing trade-offs between values like free expression and community safety represents a central concept in the policy debate. In practice, content moderation is a "system of administration" that includes automated moderation, labeling, stakeholder engagement, personalization, and design features [16]. CONTENTR focuses on decisions to allow or ban real-world examples of content because contemplating binary decisions on difficult content provides a powerful entryway to understanding the impacts and current limits of CDA 230 [24].
The game narrative was our central tool for encouraging players to engage with balancing values trade-offs. Players were told they represent a startup operating within a US policy context that values innovation and competition in the social media sector. Players were then directed to try to balance FE and CS points. The game narration describes CDA 230 as a shield that protects the platform from legal liability for content, with some exceptions, as well as a sword that allows platforms to make their own decisions about content moderation [26].
Next, we embedded values trade-offs into the major actions of the game. In the first round, players were asked to set policies to allow or ban particular content types. In the second round, teams interpreted their policies, making decisions to allow or ban challenging content drawn from real-world examples. In the third round, players gained or lost FE and CS points based on event cards that interact with the decisions the team had made. Points help players monitor their balance of FE and CS. Event cards capture the outside influence that media coverage has on platform policy [34] as well as the impacts of technological factors such as machine learning classifiers. Players keep or lose both FE and CS points depending on how their moderation decisions might play out in the real world. Figure 1 includes an example of how each round builds on the last to challenge players to consider FE and CS. For the yellow policy cards (Round 1), we chose representative real-world community guidelines inspired by a range of companies [2-4, 24]. In Round 1, players sorted yellow cards to create policies for CONTENTR. In Round 2, players used these policies to make decisions about whether to allow or ban the content described in gray content cards. The content cards were inspired by real-life examples of controversial content. To cover a wide range of challenging decisions, we populated the content cards with controversial content types drawn from Gillespie [24], including text descriptions of sexually explicit content, illegal activity, self-harm, graphic content, harassment, hate speech, and quality contributions. (Because many of these content types have the potential to be graphic or upsetting, participants were shown written descriptions of the content rather than graphic language or images.) Finally, the blue social event cards and purple algorithm event cards change point totals based on players' previous policy and moderation choices.
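To make the interaction between earlier decisions and the event cards concrete, the scoring logic can be sketched in a few lines of Python. This is purely illustrative and not the authors' implementation: the triggers, card contents, and point values below are invented for the example.

```python
# Hypothetical round-1/2 decisions: one allow/ban choice per content type.
decisions = {"hate speech": "ban", "graphic content": "allow"}

# Each event card names the decision it reacts to and the FE/CS point
# changes for each choice the team could have made (values are invented).
event_cards = [
    {"trigger": "hate speech",
     "effects": {"ban": {"FE": -2, "CS": +3}, "allow": {"FE": +1, "CS": -3}}},
    {"trigger": "graphic content",
     "effects": {"ban": {"FE": -1, "CS": +2}, "allow": {"FE": +2, "CS": -2}}},
]

def apply_events(decisions, event_cards, fe=10, cs=10):
    """Update free-expression (FE) and community-safety (CS) point totals
    based on how each event card interacts with earlier decisions."""
    for card in event_cards:
        choice = decisions.get(card["trigger"])
        if choice in card["effects"]:
            delta = card["effects"][choice]
            fe += delta["FE"]
            cs += delta["CS"]
    return fe, cs

print(apply_events(decisions, event_cards))  # (10, 11) with the values above
```

The key property the sketch captures is that the same event card rewards or punishes a team depending on its earlier choices, so neither a pure free-expression nor a pure safety strategy dominates.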

2.1.2 Emulating Moderation Roles & Collaboration.
Content moderation can be an opaque process [40,50], but both academic literature and media reporting have explored the experiences of the people who shape moderation decisions. This literature illustrates the difficulty of moderation work [17,44], and we hoped to help players experience this difficulty, first as policymakers and then as moderators. Role descriptions were displayed beside the game board and a facilitator read the roles to players at the beginning of the game, encouraging players to "make an intentional effort" to empathize with the roles [21].
Individuals working on a platform's policy team are tasked with writing clear policies that can be enforced by moderators [34,45]. Additionally, they are tasked by company leadership to focus on metrics important to the business, such as growth and profit [15,23]. This experience was captured in the game through the instructions to create a popular platform and, in the final version, through players' decisions about how to spend investment dollars on product design features and safety processes.
Commercial content moderators can struggle to make important decisions quickly and to understand the context of the content they are reviewing [39,44]. Many moderators physically work in locations far from where the content is created [45]. Engaging with disturbing content is emotionally draining [49]. Therefore, we included examples of content and events that could trigger emotional reactions.
The game development and testing received exempt status from our institution's review board (UMD IRB 1682807-2). Because game play could trigger emotional responses, at the beginning of each game session we described the content that would be discussed and allowed participants to leave the game at any time. Our team also made ethical decisions about what examples of content to include. We understood that players' cultural backgrounds or personal histories could make specific cards particularly upsetting to discuss. To account for this, we included a wide variety of cards in the hope that no individual player would experience an unfair burden due to their identity and life experiences.
Scholars have also considered the relationship, or power dynamic, between a platform's policy team and content moderators [34,45]. Specifically, commercial content moderators tend to lack autonomy [34,45]. In some cases, there are structures in place to move decisions up a chain of command [34], and it is not uncommon for discussions to take place related to borderline content [7,34,45]. We simulated this experience through the introduction of nuance cards. At the end of each content card round, players had the opportunity to add nuance to a policy card from the previous round. By editing a specific policy card, players were able to add details to, or even change, a previous decision made about one or more content cards.
Finally, we designed the game narrative and layout to be collaborative to support engagement and learning [53]. Each policy and moderation decision was discussed and made as a team, using either consensus or voting. Throughout the game, players could ask questions and the facilitator could ask questions or refocus the group. The final version of the game is self-facilitated.

Verification & Evaluation
To verify that we had successfully translated our goals (educate players on values trade-offs, moderator roles, and learning through collaboration) into game design elements, as recommended by the VAP framework, we facilitated three phases of testing and iteration. Due to the COVID-19 pandemic, we virtualized the tabletop card game using the online whiteboard software MURAL. Phase 1 included 4 play sessions with 8 players recruited among adult colleagues and friends. These sessions were casual, with players asking questions and offering feedback in real time. No formal data was collected, but the authors took notes to track player reactions and difficulties in play.
After this pilot phase, we edited the game's backstory to better define the business context for the start-up and improve clarity in the role descriptions. We also set concrete guidelines for the nature of the collaboration, telling teams to aim for consensus but to default to a vote after 2-4 minutes of discussion to keep the game moving. Additionally, cards that participants found confusing were cut or replaced. To enable shorter and longer versions of the game, we split the policy card, content card, and event card rounds into cycles based on groupings of types of content. Cycle 1 covered harassment and hate speech; cycle 2, sexually explicit content and illegal activity; and cycle 3, self-harm and graphic content. Figure 2 represents the game board after phase 1 trials.

Tradeoffs
Comments related to creating mass appeal by maximizing free expression and community safety, the success of the start-up (CONTENTR), and reward elements of the game.

Emulating Moderation Roles
Comments in which participants identified with or understood the experiences of content moderators or the policy teams.

Collaboration
Comments related to discussion among players, the ways decisions were made, the layout of the board, and the ability to follow the conversation/decision points.
Phase 2 play sessions generated post-game assessments informing the evaluation and game design alterations [29]. We analyzed the feedback using a priori analysis to understand the extent to which our values-based goals (values trade-offs, emulating moderation roles, and collaboration) were successfully represented by the game [33]. The first two authors independently coded a subset of the responses. As there was initially disagreement between the coders, the two authors met to discuss the codes and come to an agreement. Once we finalized the code book (see Table 1), the first two authors re-coded all the data independently.

Phase 2 Results: Only 5 participants made statements suggesting that values trade-offs were captured effectively through game elements. 18 participants made comments about the lack of goals or clarity regarding trade-offs and made suggestions for different ways to frame rewards. For example, P28 said, "The possibility of losing rather than just getting the best score would also help the game be improved. I think it would be really fun if you were playing and had to make some decisions you don't necessarily like to try to avoid either the [FE] points or the [CS] points from going down to zero." The majority of participants (23) made statements suggesting that empathy through emulation of moderator roles was successfully translated in the early version of the game. P24 explained the emotional distress that content moderators can experience: "There were very adequate trigger warnings - but the content itself is difficult to experience. That being said, I think that creates a very interesting and insightful experience. It helped me to see how damaging these issues would be on the moderators."
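The paper describes independent coding followed by discussion to agreement, but does not report an agreement statistic. For two coders applying a finalized code book, Cohen's kappa is one standard measure of inter-coder agreement; the sketch below shows the computation, using invented labels drawn from the codebook's three codes rather than the study's actual data.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labeling the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels for six participant comments.
a = ["tradeoffs", "roles", "collab", "roles", "tradeoffs", "collab"]
b = ["tradeoffs", "roles", "roles", "roles", "tradeoffs", "collab"]
print(round(cohens_kappa(a, b), 3))  # 0.75 for these invented labels
```

Values near 1 indicate the code book is being applied consistently; values near 0 indicate agreement no better than chance, suggesting further discussion of the codes is needed.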
While most players expressed empathy, 11 made statements related to elements of the game that made it difficult to feel empathy for the roles.Additionally, facilitators recognized that players in phase 2 trials came to see content moderation as a set of binary choices ("allow"/"ban") and they did not get to engage with the design choices and safety practice investments social media companies must also consider [16].
Lastly, 18 participants mentioned positive aspects of the collaborative elements of the game, while 12 participants shared negative experiences. Many players enjoyed the collaborative game play, citing learning from fellow participants. P10 described, "It provoked pointed conversations about the ethics of social media, and I enjoyed hearing the perspectives of my teammates." Some players mentioned that hearing from other players led them to change their beliefs: "I liked learning what other people found offensive and their thought process which sometimes questioned my own beliefs" (P13). However, some players critiqued the voting process, explaining, "I would have preferred a consensus format" (P6).
Phase 2 Game Element Improvements: The phase 2 data and game facilitator notes led to the following changes (also displayed in Figure 3):
• Switched the final game outcome from total FE/CS points to a monetary balance sheet.
• Added investment cards to the beginning of each cycle to simulate a growing startup.
• Added growth policy cards, which allowed players to make choices regarding investing in bot detection, the types of algorithms used to recommend content, and fact-checking and labeling systems.
• Recommended the game be played with 3 players to make it easier to reach consensus and decrease play time.
• Rearranged the board to make it easier for players to see the decisions they made during round 1.
• Rearranged the cycles to more closely mimic a startup social media company's growth, placing sexually explicit content and illegal activity first, as these are the types of content social media companies are required to remove in today's legal context.

For phase 3, we hosted 2 sessions, each with 3 participants, on two Saturday mornings in June 2021. Each participant was paid $30 for their time. Recruitment consisted of emailing listservs affiliated with our university and Meetup communities. Each participant completed the same exit survey completed by the phase 2 participants. Each trial was recorded, transcribed, and coded using the code book provided in Table 1. Because we wanted to understand how each of the game goals (values trade-offs, emulating moderation roles, and collaboration) manifested through play, we used thematic analysis to identify sub-themes in the data [12].
Phase 3 Evaluation Results: Three sub-themes emerged as players discussed balancing values trade-offs: user autonomy, inclusiveness, and the responsibility of online platforms to be stewards of information. Several players emphasized the importance of user autonomy. For example, P106 expressed wanting to give users choice regarding their online experience: "I think this type of content [grey content cards] should be fine, but people should give consent in order to be confronted with it in the most mild or aggressive sense." Players also thought about inclusiveness and making CONTENTR a safe place for all people. When discussing content card 14 (extremely thin woman using a hashtag that may indicate pro-eating disorder content), P103 explained, "I feel like banning it might be a form of body shaming, and some people just are really thin." Players also discussed their responsibility as intermediaries of information, including the importance of news availability and concerns regarding who should make choices about what is true: "fact checkers can pick things out of context. I've seen it happen a lot" (P106).
Comments also revealed that players were no longer confused about their goal to balance FE and CS; instead, they debated causality, questioning whether it is fair to conclude that harms can be directly attributed to the content on a social media platform. We found this conversation to be productive and important for learning about platform governance challenges.
Phase 3 trials also underscored the ways CONTENTR captures the challenges associated with working in content moderation. Players successfully felt the weight of being decision-makers. P102 reflected, "there are a lot of value judgments being made . . . content moderation and regulation on platforms, is really about values to that extent." Additionally, when discussing policy card 18 (terrorism), P106 explained, "we could potentially be oppressed more than ever, by not only our own governments, but also platform governance." Lastly, when discussing content card 29 (MAGA hats turned into swastikas as an art project), P101 said, "I feel like there's no right or wrong answer to this."

While playing the policy role, players considered company growth. When discussing the choice to spend a large sum to fight bots, P102 argued, "So we might as well get a handle on the problem early before bots spread," and P103 responded, "I guess, for me, it'll depend a little bit about what the bots are like, are the bots violating any policies? Or are they just there? And if there's no data on how many bots there are, I'm just not sure if I want to spend $5 million."

The collaborative nature of the game allowed for sharing of experiences and challenged each player's personal viewpoints. In survey results from phase 3, five out of six respondents mentioned that interacting with the other players was their favorite part of the game. Collaboration allowed for knowledge sharing regarding content card context. For example, when discussing content card 17 (Pepe the Frog), P105 asked, "who is Pepe the Frog?" and P106 was able to explain how Pepe the Frog began as a harmless cartoon and morphed into a hate symbol over time [25]. Discussions during phase 3 sessions also led to players changing their minds. It was not uncommon to hear comments like "I'm glad I wasn't the only player because . . . I would have made, probably different decisions" (P102).

Finally, collaborative discussions also allowed players to share personal experiences. For example, when discussing content card 26 (which contained a Sinophobic slur for COVID), P105 shared, "I live in [large American city] and the fact that anti-China sentiment like this leads to anti-Asian sentiment which affects billions of people."

Phase 3 Game Element Improvements: Phase 3 results suggested that the game successfully engaged players in considering values trade-offs while building empathy for moderation roles and supporting collaborative learning. However, running trials in which players play all three cycles in one sitting suggested process changes to cut down on the time needed for the game. We altered the game instructions to prompt players to discuss cards only when there are disagreements, as opposed to prompting players to discuss their choices throughout. We also edited play to include nuance cards only once per round and challenged players to consider the most important change they wanted to make. Finally, we updated the presentation of the final score to allow players to assess the meaning of the final balance sheet, adding nuance to the ways players can "win" the game. With these changes, we finalized the game. The game became a successful central activity for an educational virtual Citizen Panel on Section 230 reform in August 2021.

TABLETOP CARD GAME & FUTURE WORK
Our final design stage increased the accessibility of the game by creating a tabletop version that does not require virtual whiteboard access or a facilitator (cards can be maneuvered on a physical flat surface, as seen in Figure 4). We created PDF cards, a layout that works for in-person play, and a classroom instruction guide. Once the tabletop cards, scorecard, and instruction guide were ready, we tested them for self-guided group play in an undergraduate course on information policy. These play sessions helped ensure that the instruction guide was comprehensible to undergraduates. Finally, we are finalizing an online version of the game mimicking the tabletop version that can be played individually or as a group. The PDF cards, classroom instruction guide, and virtual game are freely available on our website.

DISCUSSION & CONCLUSION
Through iterative cycles of testing with student participants, we found that CONTENTR successfully encouraged participants to collaborate and discuss the challenges of balancing community safety and free expression.CONTENTR also created a deeper understanding of the challenges involved in developing and enforcing content policies and challenged participants' pre-existing values and assumptions through both game elements and exposure to other participants' perspectives.

Balancing FE & CS
Our evaluation suggests that playing the role of a startup social media policy team and content moderator in a collaborative setting created a space for players to experience the challenge of balancing values like free expression and community safety, as well as a space to discuss values such as autonomy, inclusion, and truthfulness. Specifically, players learned that consistently applying high-level, relatively context-free policies is challenging and that content moderation decisions often depend on nuanced contextual detail. This idea was expressed through the players' frustration with only having one nuance card per round. Respecting contextual nuance is a challenge in content moderation and has led mature social media companies to create "booklets," very specific and detailed sets of instructions that go well beyond the policies outlined in community guidelines, which are simulated through policy cards in CONTENTR [34].

Figure 4: An example of how the beginning of a round played on a physical tabletop may look.
Additionally, players experienced the idea that there is "no winning" as they watched the event cards result in loss of points regardless of how thoughtful they were in their decision making: there was almost always a stakeholder group upset by content moderation.This experience mirrors the experience of online platforms today as they grasp for (untenable) forms of political neutrality [24].

Empathy through Emulation
By capturing aspects of content moderation roles, including the challenge that comes with being a decision-maker, CONTENTR provided players with a novel experience. First, many participants noticed the emotional exhaustion that can come with this type of work. Additionally, players struggled to understand the context of certain cards and had to ask questions or acknowledge they did not know what a reference meant. Lastly, throughout the game trials, players expressed frustration about not being able to change features on the platform related to labels, filters, or recommendation systems. The frustration that a lack of decision-making power causes for people working in content moderation is captured in Roberts' work and in documentation like the "Facebook Papers," where it is evident that employees on Meta's trust and safety team were making recommendations disregarded by leadership [38,45].

Collaboration Between Players
Discussions allowed for players to see the interaction between policies and values in a way that solo decision-making would have struggled to capture.Throughout the game, players listened to the other participants and changed or expanded their views.
A final theme that arose throughout game play was the role of the players' identity in their content moderation decisions. It is not a surprise that someone's experience in the world would influence the type of content they find more or less permissible. More established social media companies account for this by training content moderators extensively in the nuances of their community standards [34,45]. The experience simulated in CONTENTR represents a startup in the early phases of developing policies. As an educational tool, we wanted the game to facilitate thoughtful discussion among participants and leave space for sharing personal experiences that shaped their reasoning for a specific decision [19,43]. Evaluation of the game over three rounds shows the game is successful in meeting these goals. We hope that this game will be useful to educators in civics and policy, computer and information science, and those working to engage the public in critical questions around how content is best regulated online.

Figure 1: An example of cards from each round related to sexually explicit content.
2.2.1 Phase 1 (Pilot) Trials. Phase 1 trials occurred in the summer of 2020 and were analyzed through observational methods.

Figure 2: Layout of the game board used for Phase 2 trials.

Figure 3: Layout of the game board used for Phase 3 trials.

Table 1: Codebook for Qualitative Data.