See Widely, Think Wisely: Toward Designing a Generative Multi-agent System to Burst Filter Bubbles

The proliferation of AI-powered search and recommendation systems has accelerated the formation of “filter bubbles” that reinforce people’s biases and narrow their perspectives. Previous research has attempted to address this issue by increasing the diversity of information exposure, but such efforts are often hindered by a lack of user motivation to engage with that information. In this study, we took a human-centered approach to explore how Large Language Models (LLMs) could assist users in embracing more diverse perspectives. We developed a prototype featuring LLM-powered multi-agent characters that users could interact with while reading social media content. We conducted a participatory design study with 18 participants and found that multi-agent dialogues with gamification incentives could motivate users to engage with opposing viewpoints. Additionally, progressive interactions with assessment tasks could promote thoughtful consideration. Based on these findings, we provided design implications with future work outlooks for leveraging LLMs to help users burst their filter bubbles.


INTRODUCTION
The boom of the PC and mobile internet has revolutionized the way information is consumed and disseminated. AI-powered search and recommendation systems are now a common feature on social media, news, and streaming platforms, analyzing users' behavior, preferences, and interactions to provide personalized content. While these systems have greatly improved users' experience, they can also exacerbate a phenomenon known as the "filter bubble" or "information cocoon", where individuals tend to consume more information that confirms their existing beliefs, potentially narrowing their perspectives and reinforcing their potential biases.
Figure 1: The interfaces and the interaction flow of our prototype: users can interact with LLM-powered multi-agent characters while reading social media content.
To address that issue, researchers have proposed and studied three primary approaches: optimizing recommendation algorithms, expanding users' exposure to diverse information, and nudging them towards it. Recommendation algorithms were designed to take more information into account, such as inter-item correlation [70], user profile [30], social network information [60], and diversity of the recommendations [39]. Researchers also explored exposing users to a variety of information from other users on content platforms [6], or news from agencies [51] or users [28] with different ideological standings. Other explorations include presenting the credibility of content [7], the reason for seeing a particular article [64], and visualization of users' political leanings [48] to nudge users to broaden their content consumption range and reflect on the information they engage with. However, previous approaches tend to focus on increasing the diversity of information and content exposure without sufficiently taking into account one of the most important elements in this process: the user. Simply providing users with diverse perspectives is not sufficient by itself [56], as it requires both the user's willingness to explore and the capacity for in-depth processing of the content to truly move the needle. Therefore, it is crucial for users to discover, interact with, and reflect on diverse perspectives outside of their existing filter bubbles in order to effectively burst them.
Helping people deal with the "filter bubble" is challenging for two main reasons. First, the quantity, quality, and diversity of perspectives are highly dependent on user-generated content (UGC) on online platforms. UGC may carry over its creators' potential biases on a particular topic, which could further affect the availability of relevant perspectives to other users. Second, motivating users to engage with and deeply think about diverse perspectives requires a system to continuously understand dialogues with users, and provide instant, interactive, and inspiring feedback [32], which was difficult to accomplish with previous Natural Language Processing technologies [50].
Recent advancements in Large Language Models (LLMs) might provide an opportunity to help overcome these challenges. These generative models possess the capability to effectively simulate a diverse array of viewpoints, personas, and expertise in a given domain [11,41,54]. Additionally, LLMs have shown promising results in engaging users in continuous dialogues and promoting in-depth thinking in various interactive scenarios, such as fostering meaningful conversations between teachers and students [61], as well as doctors and patients [58], in schools and hospitals [45].
Inspired by LLMs' proficiency in generating contextually relevant text [42,44], we proposed to utilize GPT-4 to engage users in meaningful multi-round dialogues, in order to encourage them to contemplate perspectives beyond their own filter bubbles, rather than merely presenting them with diverse viewpoints. However, it remains unknown how such an LLM-powered system should be designed and whether and how such a system may help users access and reflect on diverse information, which are our two research questions (RQs).
To answer the first RQ, we adopted a human-centered approach and conducted a three-hour design workshop attended by a diverse group of participants, including HCI and UX researchers, designers, and psychologists, all of whom are also users of online content platforms. The workshop aimed to generate design ideas to address the research question. Following the workshop, we established three key design considerations around how to provide diverse perspectives, foster deliberate and critical thinking, and motivate user engagement. Based on these considerations, we designed interaction features that leverage LLM-powered multi-agent characters, a frictionless and progressive interaction flow, and gamification design to motivate users to interact with diverse perspectives and engage in thoughtful consideration while reading social media content.
To answer the second RQ, we developed a prototype to incorporate the aforementioned interaction features and conducted a user study with 18 participants selected from online content platform users. During the study, they were asked to participate in a range of activities including reading posts and comments and interacting with multi-agent characters within our prototype. Both quantitative and qualitative methods were employed to assess the participants' levels of engagement and the depth of their information processing. Results showed that participants exhibited an inclination to engage with unexpected viewpoints when incorporated into human-like dialogues and enhanced by gamification incentives. This engagement, coupled with progressive assessment tasks, enriched their understanding and stimulated deeper reflection across a broader range of perspectives. In sum, our work made the following contributions:
• We have identified three crucial design considerations for bursting filter bubbles through a participatory design workshop;
• We have designed and developed a prototype with interaction features to promote deeper engagement and critical thinking in diverse information;
• We have carried out an empirical laboratory study that evaluated the efficacy of these design considerations and features and present key design implications to guide future practice in assisting users to burst filter bubbles with Large Language Models.

RELATED WORK

Information Processing of Human Beings
Human information processing comprises both bottom-up and top-down processing, with the latter being influenced by an individual's prior knowledge and experience [69]. People tend to favor information that conforms to their personal beliefs, while feeling reluctant to interact with information that contradicts them [49]. This inclination, known as confirmation bias, hinders the effect of providing diverse viewpoints to users, as they may ignore or dismiss such information, preferring to remain within their own "filter bubbles." The presence of confirmation bias suggests that merely providing diverse information isn't enough to break filter bubbles [56]. A higher level of information processing, such as semantic processing, is required as opposed to shallower processing [16]. In fact, persuasive information that sharply contrasts with one's beliefs, resulting in cognitive dissonance for the recipient, can trigger a change in attitude in the opposite direction of what was intended, a phenomenon known as the boomerang effect [15]. Therefore, persuaders should consider applying persuasive techniques, that is, framing messages in a way that minimizes cognitive dissonance and resistance, such as acknowledging shared beliefs and establishing an empathetic connection [59], to make the message more relatable and less disconcerting.
Taking into consideration the factors that influence human information processing, we could effectively "nudge" human behaviors [62]. The human information processing theory suggests that information processing is driven by both epistemic and social motivations [25]. In terms of epistemic motivation, nudges such as feedback nudges, reinforcement nudges, and friction nudges have been designed [5]. On the other hand, social motivation also drives the kind of information humans attend to, encode, and retrieve, and therefore has an impact on human behavior. Research has shown that the surrounding context of an online discussion (e.g., exposure to previous trolling behavior) could intensify trolling behavior among users [13]. Furthermore, human-like embodiment, that is, the presentation of an embodied agent, could improve user motivation [47].
Of note, gamification, which refers to the use of game design elements in non-game contexts [22], with points, leaderboards, and badges as the most commonly used elements [33], has become a popular method for motivating desired behaviors, especially during information processing. The most frequent implementations of gamification have been observed in the context of education or learning [33,38]. Studies in education/learning contexts generally found that the learning outcomes of gamification were positive, notably through enhancing motivation and engagement in learning tasks, as well as increasing enjoyment over them [46,77]. Researchers pointed out that for gamification to be effective, the entity being gamified should already possess intrinsic value. This gives people a good reason to interact with it. For instance, gamification would not benefit a news site that lacks fresh content [21]. Contextual factors such as the voluntariness of participation, the nature of the system (utilitarian or hedonic), and user motivation (cognitive or affective) play an essential role in shaping user interaction behavior with gamified systems [21]. Consequently, when assessing the impact of gamification on a specific system, it is important to consider these contextual factors.

Current Approaches to Address Filter Bubbles
Research suggested that information filtering occurs at the individual, social, and technological levels [27]. While cognitive mechanisms, such as confirmation bias, contributed to the filter bubble effect, social factors, such as homogeneous social networks, and technological mechanisms, like recommendation systems that filter information based on predicted user preferences, further exacerbate this effect. Prior work in the HCI field that sought to address the filter bubble problem mainly focused on optimizing the design of recommender systems [1,60,70], displaying more diverse information to the users [6,8,28,48,51], and providing more information rather than merely the content itself [7,64].
In the research area of recommender systems, statistical models have been leveraged to take inter-item correlation into consideration to generate diverse recommendations on YouTube [70]. Grossetti et al. found that the recommendations generated by recommendation systems could deviate from users' community profiles, contributing to the formation of a filter bubble for some users. To mitigate this effect, they advocated for a re-ranking strategy that utilized users' community profiles and the community network to reduce the filter bubble effect [30]. Sheth et al. investigated the effectiveness of recommendations from users of different social groups that are not connected [60]. Adomavicius and Kwon took the content that is less popular into consideration [1]. Pyrorank, a novel re-ranking algorithm, was developed to enhance the diversity of recommendations while preserving prediction accuracy [39]. Their experimental results show that, at the expense of a negligible accuracy loss, recommendation ranking approaches represent effective techniques for obtaining the desired levels of diversity.
Displaying more diverse information to the users has also proven effective. Pop is a Google Chrome extension that augments users' Twitter feeds with news tweets from agencies of different ideological standings [51]. However, no empirical studies have been done to evaluate its usage and effectiveness. OtherTube displays videos recommended based on other users' watching history [6]. Social Mirror promotes viewpoints from a user with different political ideologies on Twitter. The user study result indicated that recommending accounts of the opposite political ideology to follow reduces participants' beliefs in the political homogeneity of their network connections [28].
Prior work also investigated providing more information rather than merely the content itself. NudgeCred is a browser extension for Twitter that provides information on the credibility of the content the user consumed [7]. It is a collection of three nudges: reliable, questionable, and unreliable. A controlled experiment indicated that NudgeCred significantly helped users distinguish news tweets' credibility. Bubble Trouble emphasized transparency in the news curation process by providing users with the capability to find out why they are seeing a particular article [64]. It specifies the topic category of the user's choice the article falls under as well as the criteria used to measure the relevance of the story. Munson et al. found that users would display a modest inclination towards more balanced information consumption if provided with feedback about their political leanings based on their reading habits [48].
However, the diverse content and other information provided were not interactive for users. Wood et al. built an app called Newsr, which incorporates a co-annotation feature allowing users to create graffiti-like annotations on online news articles [72]. They found that this approach effectively helped users broaden the range of news content they read and enabled more directed critique of online news stories, which suggests the importance of interactivity in engaging users with the information provided. Yet, user-generated content tends to lag in terms of real-time relevance, which means that even when users want to know more, they cannot directly ask the interface to provide the specific information they want. Thus, we were inspired to investigate the design space of bursting the filter bubble through interactive information providers.

Capabilities of LLMs to Emulate Multi-agent Characters
Recent advancements in Large Language Models (LLMs) have unlocked unlimited possibilities for real-time generation of high-quality content that is virtually indistinguishable from human-created content. LLMs demonstrated dialogue proficiency in generating content that is relevant, appropriate, and creatively diverse, often matching or slightly exceeding human-crafted benchmarks [42,44]. One of the primary breakthroughs, the few-shot learning mechanism, enables LLMs to learn and generate content based on only a few examples and generate diverse content across multiple domains, highlighting its versatility and adaptability to diverse roles and contexts without the need for explicit task-specific training [11,67]. By constructing a reward model derived from human feedback, the AI agents were trained to mimic human-like decision-making processes and became more closely aligned with human expectations, enhancing their emulation quality and making their responses more reliable [29,52]. The mechanics of implicit gradient descent enable real-time adaptability for LLMs, enhancing their ability to learn and improve through in-context learning, allowing them to quickly adopt versatility and assume diverse character roles [3,17].
The art of prompting has emerged as a pivotal technique to guide LLMs to effectively generate contextually relevant and character-specific responses in real-time [43]. These mechanisms and methodologies collectively empower LLMs in their extraordinary content generation capabilities, which serve as the cornerstone of a noteworthy area in recent research: using LLMs to emulate multi-agent characters. Researchers have shown the potential of LLMs to simulate human-like behavior in a variety of real-world scenarios, ranging from performing daily activities to engaging in artistic endeavors [53]. Interestingly, signs of strategic behaviors were found to emerge in LLMs when engaged in communication games, even without tuning the parameters [75]. The adaptability of LLMs in diverse settings was further exhibited when they were placed in open-world environments like Minecraft, where they navigated complex terrains and interacted with their environment using text-based interactions, showcasing their capability to handle a variety of situations [79]. In such LLM-powered multi-agent societies, the role-playing approach has been examined with promising results to effectively guide these communicative agents toward solving complex tasks autonomously [41].
As we delve deeper into domain expertise, there is a growing trend towards instructing these models to emulate expert personas. For example, researchers have showcased how GPT-3.5 could emulate expert roles and facilitate multi-agent collaboration, streamlining the software development process [54]. In the realm of chip design, LLMs were applied to generate Hardware Description Languages from natural language prompts, highlighting their broad applicability with expert knowledge [9]. Xu et al. proposed the "ExpertPrompting" method, which strategically crafts prompts to maintain context and provide coherent responses over extended interactions, allowing LLMs to respond as specialized experts [73]. In sum, previous research has demonstrated that LLMs have the ability to emulate convincing, knowledgeable multi-agent characters with a wide range of expertise. Our work builds upon this foundation by utilizing LLMs to generate diverse and consistent viewpoints for each character.

PROTOTYPE DESIGN
To answer the first RQ, we first conducted a design workshop to derive design considerations (DCs) to guide the design of an LLM-powered multi-agent system (Figure 2). Based on the DCs, we defined corresponding interaction features and incorporated them into the LLM-powered multi-agent system design.

Design Workshop
We first conducted a three-hour design workshop that brought together a multidisciplinary team consisting of three HCI and UX researchers, two designers, and two psychologists (referred to as S1-S7 hereafter). All participants were also users of online content platforms.
During the workshop, participants were initially briefed on the concept of the "filter bubble effect" on social media and the proposal to use an LLM-powered system to help users reflect on diverse viewpoints. The target audience includes all online content consumers, regardless of their awareness of their position within filter bubbles. The goal of the workshop was to engage the participants in brainstorming the design of such a system, drawing upon their professional expertise as well as their personal experiences as social media users.
The workshop was structured into two sessions, each lasting approximately 1.5 hours. The first session discussed the interaction flow, such as how to provide relevant information to users and how to encourage user reflection. The second session focused on the interaction format, such as visual design style, interface layout, etc. In both sessions, participants were also encouraged to identify potential issues that could arise during user interactions and to propose any possible solutions they could conceive. Each session consisted of three parts. We recorded the entire workshop and transcribed it. We also retained the sketches and idea cards drawn by participants during the workshop. Subsequently, two HCI researchers independently coded the transcripts and sketches. They organized the data into a table with columns that encompassed potential issues, proposed solution ideas, as well as the advantages and disadvantages of the proposed solutions. They then discussed their codings until a consensus was reached. Based on these discussions, we derived design considerations and designed our prototype accordingly.

Design Considerations
During the workshop, one intriguing design concept that surfaced and was explored extensively was leveraging LLMs to anthropomorphize multiple AI agents (referred to as multi-agent hereafter), that is, "generating vivid human-like characters with distinctive perspectives" (S1). Integrating these personalities could "foster user empathy towards the AI agents, thereby facilitating a deeper comprehension of the diverse ideas" (S1).
Expanding upon this concept, participants also discussed what design techniques could be applied to ease resistance and nurture reflection on differing ideas. Participants contributed ideas such as "structuring discussions in a way that incrementally introduces alternative views might alleviate the discomfort often associated with encountering opposing perspectives" (S3), and "incorporating fun reward mechanics that promote active engagement with a sense of accomplishment" (S4).
Another recurring point was the inherent conflict between engaging users in consuming diverse content and promoting deep thinking. For example, a "frictionless interaction flow with minimal cognitive load is desirable for encouraging users to view more content, yet this approach may predispose users to superficially process information" (S1). Similarly, in terms of visual design, a "thoughtful visual style might prompt users to process information more seriously" (S4) but could also "diminish their willingness to use the system" (S6).
Based on these findings, we derived the following three design considerations (DCs) that an LLM-powered system should address.

DC1: Providing Diverse Perspectives through Multi-agent Characters.
In order to assist users in breaking out of their filter bubbles, the AI agents in the system should offer a wide and comprehensive range of perspectives. To achieve this, their personas (including age, gender, education level, profession, etc.) and their attitudes toward the topic should be sufficiently diverse, allowing users to be exposed to a rich variety of characters and viewpoints. "The personas created should be detailed...This not only ensures better prompting outcomes for GPT but also results in more vivid character representations." (S2) Moreover, it is recommended that we present a holistic view of the perspectives, enabling users to easily grasp the full picture of the viewpoints. In doing so, we could "reduce users' cognitive load" (S6) by summarizing the information for them, while still preserving the richness of the content.

DC2: Fostering Deliberate and Critical Thinking through Progressive Interaction and Assessment Tasks.
Simply presenting users with a range of perspectives does not ensure that the information will be effectively absorbed. It is equally crucial to steer users towards more deliberate contemplation. Given humans' natural propensity to focus only on content that aligns with their pre-existing beliefs, it is recommended to introduce them gently to contrasting views. "When people use social media, encountering completely opposite opinions can be hard to accept and may even elicit anger" (S2). Thus, when AI agents present their viewpoints, we should prompt the LLMs to employ persuasive techniques. This approach aims to prevent the onset of cognitive dissonance, which could cause users to cling even more firmly to their existing beliefs.
Furthermore, we could incorporate assessment tasks to steer users toward a deeper "semantic processing, an indicator of deep processing" (S1) of the perspectives.Through feedback from these tasks, users could also check their comprehension of the presented views.

DC3: Motivating User Engagement through Natural Interaction and Gamification Design.
Viewing and reflecting on opposing viewpoints is not a natural inclination for humans. They may struggle to stay focused and wish to shift to tasks that require less mental effort. "People are typically not primed for deep processing of information on social media; hence, it's crucial for the system to be engaging." (S4) As a result, it is imperative that we utilize design to encourage users to interact with perspectives that challenge their own beliefs. Firstly, natural and frictionless interaction design that does not disturb the user's intended browsing experience is advised. Secondly, the cognitive demands required during their interaction with the system should be minimized. Thirdly, we could employ gamification incentives to motivate users to prolong their exploration within the system.

Prototype Features
Based on these design considerations, we designed the architecture of our prototype to resemble a mainstream text-based forum, aiming to simulate the user experience of browsing online media while minimizing distractions. More specifically, our prototype incorporates five core interaction features (Figure 3) that could potentially help navigate users out of their filter bubbles.

LLM-powered Multi-agent Characters.
Our design incorporates multi-agent characters with diverse perspectives generated by the state-of-the-art Large Language Model GPT-4. Each character has a realistic background in terms of gender, age, occupation, and education. To enhance the sense of realism, each character is represented by an avatar that conforms to their portrait, aiming to provide users with the impression of talking to a real person. These avatars are displayed in the avatar panel at the top to encourage users to explore other characters and their perspectives. Upon selection, a character overview is presented to facilitate a better understanding of the characters (Figure 3a).
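As an illustration of how a character background of this kind might be turned into a role-play instruction for a chat-style LLM, the following is a minimal sketch. The `Persona` fields mirror the attributes named above (gender, age, occupation, education, attitude), but the class, the function `build_system_prompt`, the prompt wording, and the example values are all assumptions made for illustration and are not the prompts used in the prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical persona record; field names are illustrative."""
    name: str
    gender: str
    age: int
    occupation: str
    education: str
    attitude: str  # the character's stance on the topic


def build_system_prompt(persona: Persona, topic: str) -> str:
    """Compose a role-play instruction (assumed wording, not the authors')."""
    return (
        f"You are {persona.name}, a {persona.age}-year-old {persona.gender} "
        f"who works as a {persona.occupation} and holds a {persona.education}. "
        f"Topic under discussion: {topic}. "
        f"Your attitude toward it: {persona.attitude}. "
        "Stay in character and answer conversationally."
    )


# Invented example values; they are not taken from Table 1.
example = Persona(
    name="Alex",
    gender="woman",
    age=45,
    occupation="nurse",
    education="bachelor's degree",
    attitude="cautiously supportive",
)
prompt = build_system_prompt(example, "delayed retirement policy")
print(prompt)
```

The resulting string would typically be passed as the system message of a chat request; the request itself is omitted here because the source does not describe how the model was called.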

Frictionless Interaction Flow.
To ensure a smooth and natural interaction, we integrated the primary entrance to initiate a dialogue with multi-agent characters directly within the comments section of the primary forum-like interface. This layout maintains users' attention within the same visual field while reading posts and reduces the disruption caused by switching between different areas. Furthermore, we provided default response options generated by the LLM during the conversation (Figure 3b). These options, including seeking clarification or elaboration on viewpoints in greater detail, serve as a user-friendly guide, reducing the cognitive burden by minimizing the need for active input.

Viewpoints Jigsaw Puzzle.
To encourage users to engage with diverse perspectives, we have introduced a novel feature called the Viewpoints Jigsaw Puzzle (hereinafter referred to as the "Viewpoints Puzzle"). This feature runs parallel to the dialogue window and is designed to follow the reward mechanisms of games. A progress indicator for dialogue rounds at the top of the dialogue window encourages users to continue the conversation with the AI agent. When the conversation lasts for five or more rounds, the user is encouraged to explore more viewpoints by "lighting up" other avatars in the Puzzle, which consists of five pieces, each representing a different character's viewpoint. Users are encouraged to "light up" all pieces of the Puzzle by interacting with all characters with the required level of engagement (Figure 3c).
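The round-counting behaviour described above can be sketched as a small piece of state. This is only an illustrative model: the five-round threshold comes from the text, while the class `DialogueProgress`, its method names, and the placeholder character labels are hypothetical.

```python
from dataclasses import dataclass, field

ROUNDS_REQUIRED = 5  # "five or more rounds" in the description above


@dataclass
class DialogueProgress:
    """Tracks dialogue rounds per character (hypothetical structure)."""
    characters: list[str]
    rounds: dict[str, int] = field(default_factory=dict)

    def record_round(self, character: str) -> None:
        """Count one completed user/agent exchange with a character."""
        if character not in self.characters:
            raise ValueError(f"unknown character: {character}")
        self.rounds[character] = self.rounds.get(character, 0) + 1

    def indicator(self, character: str) -> str:
        """Text for the progress indicator, capped at the threshold."""
        done = min(self.rounds.get(character, 0), ROUNDS_REQUIRED)
        return f"{done}/{ROUNDS_REQUIRED}"

    def reached_threshold(self, character: str) -> bool:
        return self.rounds.get(character, 0) >= ROUNDS_REQUIRED

    def suggestions(self) -> list[str]:
        """Characters the user could still be nudged to explore."""
        return [c for c in self.characters if not self.reached_threshold(c)]


# Placeholder labels stand in for the five characters.
progress = DialogueProgress(["A", "B", "C", "D", "E"])
for _ in range(5):
    progress.record_round("A")
progress.record_round("B")
print(progress.indicator("A"), progress.suggestions())
```

Keeping the threshold in one constant makes it easy to experiment with a different "required level of engagement" without touching the rest of the logic.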

Progressive Viewpoints Sequence.
To prevent users from becoming overwhelmed by an excessive number of viewpoints, we present various perspectives gradually. Initially, only one perspective is displayed in the entrance, with the option to expand additional perspectives if the user desires to do so. Each click reveals an additional AI agent along with their opinion, allowing for a more gradual and incremental understanding of the content. Furthermore, we programmed the sequence of presenting each character with attitudes from negative (mainstream attitudes in the posts) to positive, with the intention to facilitate a progressive understanding of the differing viewpoints. Specifically, we begin by presenting characters whose viewpoints are similar to the existing beliefs, and gradually introduce characters with increasingly contrasting viewpoints (Figure 3d).
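One way to express this ordering is to sort characters by how far their attitude lies from the mainstream attitude and reveal one more per click. The sketch below assumes a numeric attitude score from -1.0 (negative) to +1.0 (positive); that scale, the `Agent` and `Entrance` classes, and the placeholder names are assumptions, since the source does not say how attitudes were encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    attitude: float  # assumed scale: -1.0 (negative) to +1.0 (positive)


def reveal_order(agents: list[Agent], mainstream: float = -1.0) -> list[Agent]:
    """Order agents from closest to farthest from the mainstream attitude."""
    return sorted(agents, key=lambda a: abs(a.attitude - mainstream))


class Entrance:
    """Shows one agent at first; each expand() reveals one more."""

    def __init__(self, agents: list[Agent], mainstream: float = -1.0) -> None:
        self._ordered = reveal_order(agents, mainstream)
        self._shown = 1

    def visible(self) -> list[Agent]:
        return self._ordered[: self._shown]

    def expand(self) -> list[Agent]:
        self._shown = min(self._shown + 1, len(self._ordered))
        return self.visible()


agents = [
    Agent("E", 1.0),
    Agent("A", -1.0),
    Agent("C", 0.0),
    Agent("B", -0.5),
    Agent("D", 0.5),
]
entrance = Entrance(agents)
first = [a.name for a in entrance.visible()]
after_click = [a.name for a in entrance.expand()]
print(first, after_click)
```

With a negative mainstream attitude, as in the posts described here, sorting by distance from the mainstream coincides with the negative-to-positive sequence.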

Assessment Task with Multi-choice Questions.
In addition to the gamification design, we have also incorporated multiple-choice questions on the Puzzle interface as a special assessment task. This task provides users with an opportunity to self-evaluate their understanding of the viewpoints they have interacted with. When the questions related to a particular viewpoint are answered correctly, it indicates that the user has grasped the concept, and a piece of the puzzle will be illuminated. Once the user has successfully completed all the assessment tasks related to all characters, the entire puzzle will be illuminated, symbolizing the bursting of the filter bubble and the acquisition of more comprehensive and diverse information (Figure 3e). By combining this assessment task with the gamification incentives, we aim to encourage continuous engagement, motivate thoughtful consideration, and deepen users' understanding of different perspectives.
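The illumination rule described above (a piece lights up once every question for that viewpoint is answered correctly, and the puzzle is complete once all pieces are lit) can be sketched as follows. The classes `Question` and `ViewpointsPuzzle`, the method names, and the sample questions are hypothetical; how the prototype stored or generated its questions is not stated in the source.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Question:
    prompt: str
    options: tuple[str, ...]
    correct_index: int


@dataclass
class ViewpointsPuzzle:
    """Maps each character to its questions and tracks lit pieces."""
    questions: dict[str, list[Question]]
    lit: set[str] = field(default_factory=set)

    def submit(self, character: str, answers: list[int]) -> bool:
        """Light the character's piece if every answer is correct.

        A piece that is already lit stays lit (an assumption of this sketch).
        """
        qs = self.questions[character]
        if len(answers) != len(qs):
            raise ValueError("one answer per question is required")
        if all(a == q.correct_index for a, q in zip(answers, qs)):
            self.lit.add(character)
        return character in self.lit

    def completed(self) -> bool:
        """True once every character's piece has been illuminated."""
        return self.lit == set(self.questions)


# Invented sample questions for two placeholder characters.
puzzle = ViewpointsPuzzle(
    {
        "A": [Question("What is A's main concern?", ("x", "y", "z"), 1)],
        "B": [
            Question("What does B support?", ("x", "y"), 0),
            Question("Why does B support it?", ("x", "y"), 1),
        ],
    }
)
wrong_attempt = puzzle.submit("A", [0])
right_attempt = puzzle.submit("A", [1])
puzzle.submit("B", [0, 1])
print(sorted(puzzle.lit), puzzle.completed())
```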

USER STUDY
To answer the second RQ, we developed a prototype with all the interaction features identified in the participatory design study. Then we evaluated users' attitudes toward communicating with AI agents while viewing online posts using our prototype, as well as the effectiveness of LLM-generated opinions on the depth and diversity of users' information-seeking results, through a user study with experienced social media and online forum users. This study was approved by the Institutional Review Board.

Participants
We recruited 18 participants (9 female, 9 male, aged 21-32, referred to as P1-P18 hereafter) through word-of-mouth and snowball sampling. All participants have more than five years of experience in viewing posts on social media, and all have experience in using Large Language Model chatbots (e.g., ChatGPT). Participants were compensated $25 for an approximately 60-minute session.

Materials
We gathered posts regarding the "delayed retirement policy" from the internet. The "delayed retirement policy" is designed to incrementally increase the retirement age, addressing the nation's aging population and associated economic challenges. This topic was selected for the following reasons:
• It was a topic that had garnered widespread attention and discussion on the internet at the moment when the experiment was conducted.
• The policy has a significant impact on a broad demographic, especially the younger generation, as the policy is intended to be implemented progressively to allow for societal adaptation.
• The public opinions in online discussions about this policy were predominantly skewed, marked by widespread concern and discontent regarding the extension of working years and delayed pension benefits [31,76].
Alongside, we prompted GPT-4 to generate five AI agents endowed with detailed and comprehensive personas and perspectives on this subject (Table 1).

Procedure
Participants were first informed of the aim of this study and signed a consent form. Experimenters then introduced the key features of the system and demonstrated their usage. Participants were asked to view posts under the topic of "retirement policy" and communicate with AI agents using our system, which was deployed as a web application, on a laptop for around 30 minutes. After finishing the viewing task, participants were asked to rate their experience of using this prototype on a set of 5-point Likert scales. The experimenters then conducted a semi-structured interview based on the results and observed use patterns.

Thematic Analysis
All study sessions were recorded and transcribed. Two authors read through the transcripts of three randomly selected participants together to understand their user experience of the prototype. Then, they independently coded the transcripts using an open-coding approach [10]. They combined deductive and inductive coding techniques to form the codebook. The two coders regularly discussed the codes and resolved disagreements to create a consolidated codebook. Further meetings were scheduled with the whole research team to discuss the codes and how they should be grouped into themes. The whole team iterated on the codes and their grouping until they reached consensus. In the end, we arrived at four themes: overall user behavioral patterns, engagement, diverse information, and in-depth information processing.

RESULTS
In this section, we first outline the behavioral patterns of users and their perceptions of the system. Then we discuss the findings according to our three design considerations.

User Behavioral Patterns and System Perceptions
We first examined the behavioral patterns of the participant interactions. During the prototype testing, participants generally began by viewing some posts, followed by interacting with the AI agents and exploring the Viewpoints Puzzle. Based on the sequence in which participants engaged with the AI agents, we classified them into three categories (Figure 4): seven (out of 18) participants initially chose to chat with AI agents based on their own interests (Figure 4a, interest-driven conversation order), three began by following the order presented in the system (including the order displayed in the entrance, the Viewpoints Puzzle, or the avatar panel) but circled back to engage with agents of their interest (Figure 4b, system-guided followed by interest-driven conversation order), and the remaining eight followed the system's presenting order (Figure 4c, system-guided conversation order). Examples of conversation logs from three participants representing each category are provided in Appendix B.
Thirteen participants interacted with all the AI agents. One participant (P16) exhibited a unique behavior pattern: he chose to interact with two AI agents simultaneously, alternating between them and asking each to consider the perspective of the other. The post-survey indicated mixed feedback among participants regarding the system. Among the 18 participants, 14 considered the system interesting, as indicated by their ratings of agree/strongly agree, and the other 4 rated it neutral (Mean = 3.83, SD = 0.51). Eleven participants reported a positive user experience, with ratings of agree/strongly agree, 6 rated neutral, and only 1 gave a negative rating of disagree/strongly disagree (Mean = 3.56, SD = 0.62). When asked about their willingness to use the system in the future, 11 participants expressed positive attitudes, rating agree/strongly agree, 5 rated neutral, and 2 rated negatively as disagree/strongly disagree (Mean = 3.50, SD = 0.71). These ratings indicate that, overall, participants' attitudes towards the system leaned positive, though not without reservations and concerns. The favorable ratings primarily stemmed from the incorporation of multi-agent characters generated by LLMs and the gamification design. Participants valued its novelty, describing it as "fun character design" (P3), "enhanced conversational experience similar to role-playing games" (P7), and "more engaging than regular social media browsing" (P8). The neutral and few negative ratings also suggested that for some participants, the system did not fully meet their expectations. Some perceived it as "not as effective as talking to a real person" (P12), and expressed concerns like "feels like taking a reading comprehension test when answering those questions" (P10).
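As a worked check of how such summary statistics arise, the sketch below computes the mean and standard deviation for the "interesting" item. The individual ratings are not reported; the split of 13 "agree" and 1 "strongly agree" is an assumption, chosen because it is consistent with the reported 14 positive and 4 neutral ratings and reproduces Mean = 3.83 and SD = 0.51 when the sample standard deviation is used (also an assumption).

```python
from statistics import mean, stdev

# Hypothetical reconstruction of the 18 ratings on a 5-point Likert scale:
# 4 neutral (3), 13 agree (4), 1 strongly agree (5).
# Only the 14/4 split and the summary statistics come from the study.
interest_ratings = [3] * 4 + [4] * 13 + [5] * 1


def summarize(ratings):
    """Return (mean, sample standard deviation), each rounded to two decimals."""
    return round(mean(ratings), 2), round(stdev(ratings), 2)


print(summarize(interest_ratings))  # (3.83, 0.51)
```

The same helper could be applied to the other survey items once the per-participant ratings are available.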
Regarding conversations with AI agents, 12 out of 18 participants rated the conversational flow appropriate and smooth, with ratings of agree/strongly agree, 3 rated neutral, and 3 gave negative ratings of disagree/strongly disagree (Mean = 3.61, SD = 1.09). Participants in favor of the conversational flow praised the ability of LLMs to "understand the context and generate abundant content accordingly" (P16), as well as the design of pre-generated response options to "keep the dialogue moving smoothly" (P11). When engaging with the agents, participants opted for pre-generated response options 55% of the time and typed text manually for the remaining 45% of the interactions. Intriguingly, three participants were inspired by the content of certain posts and asked the agents about their opinions on those specific topics. In the neutral and negative feedback, participants primarily raised concerns about the format of the LLM-generated responses. Some noted that "the generated text may be too long and complex for people with lower levels of education to comprehend" (P3), and others expressed expectations, such as "adding pictures or visual elements to the current text-only conversation could enhance clarity" (P18), suggesting possible refinements in the future.
Table 2 outlines participants' number of interaction rounds, as well as their ratings for both the pleasure and helpfulness of conversations with each AI agent. Agent 5 received the most interaction rounds, but also the lowest ratings for both pleasure and helpfulness. The post-interviews revealed that, because Agent 5's viewpoints were markedly different from those expressed in the posts (and potentially from the participants' own perspectives), some participants expressed a desire to "debate with and convince him" (P3).

Diversity in Information Acquisition (DC1)
5.2.1 Role Settings Allow Conversations with Various Perspectives. The core feature of our prototype is to provide varied perspectives with multiple AI agents. Results indicated that the role and perspective settings, which are generated by GPT-4 (Table 1), effectively "offered distinct perspectives" (P7, P11). In the thematic analysis, the two coders identified that the responses of the AI agents consistently aligned with their pre-determined attitudes, ranging from dissatisfaction to support for the policy. When asked to evaluate their level of agreement regarding whether engaging in conversations with AI agents would help acquire more diverse information, 12 out of 18 participants rated agree/strongly agree, 4 rated neutral, and 2 rated disagree/strongly disagree (Mean = 3.72, SD = 0.89). Those in favor highlighted the benefit of accessing diverse perspectives through interactions with AI agents, as P17 articulated: "I could immediately see the differences in perspectives from different AI identities, which broadened my desire to explore a wider array of information." Some discontent was noted regarding the "predictability of the perspectives based on the characters' identities" (P3) and the viewpoints "not exceeding existing scope of knowledge" (P16). Interestingly, we found the generated perspectives could also be novel and insightful. During the post-interviews, 11 participants pointed out that their interactions with the AI agents introduced them to previously unconsidered and inspiring viewpoints. As P4 stated, "some viewpoints of the agents were unfamiliar to me, which enriched my understanding of others and the society." Additionally, two participants mentioned that the generated response options could also be inspiring. These options, as P4 noted, could "stimulate and guide directions of the conversation".
However, 12 participants pointed out that the AI agents' responses often tended to be broad and vague. In addition, both the tone and content of the AI agents appeared to "converge as the conversation progressed" (P6), focusing predominantly on the pros and cons of the policy and related policies around the world, so "they must all be driven by the same underlying AI" (P16). It is noteworthy that the reported "convergence" was specific to the dialogue content itself as conversations between AI agents and human users evolved, while the attitudes and perspectives of the AI agents towards the policy remained unchanged. This occurred because the AI agents did not articulate their assigned viewpoints in each response to users, similar to how we as humans might not find it necessary to constantly restate our stances throughout a conversation.

5.2.2 Viewpoints Puzzle Encourages Explorations with Various Perspectives. Regarding the usefulness of the Viewpoints Puzzle design, there was a mix of positive, neutral, and negative ratings among the participants (Mean = 3.67, SD = 1.03). 12 out of 18 rated agree/strongly agree for this design being useful, recognizing it could provide "a full picture of all the perspectives" (P18); 4 rated neutral, and 2 rated disagree/strongly disagree, citing hesitations like "it is only likely to be effective when I have a lot of free time" (P14). Several participants utilized the Viewpoints Puzzle as an index, navigating through it to engage in conversations with various AI agents. As evidenced in the transition diagram between various interface elements (Figure 5), a portion of participants accessed the dialogue window through the Viewpoints Puzzle, and some of them did so by clicking on individual agents' puzzle pieces. In addition, two participants recommended enhancing the Viewpoints Puzzle by including additional information, thereby allowing them to "more effectively understand the core ideas of the perspectives" (P18).
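A transition diagram of this kind can be derived from interaction logs by counting moves between consecutive interface elements. The following is a minimal sketch of that counting step, not the analysis pipeline used in the study; the element names and the example log are hypothetical.

```python
from collections import Counter


def count_transitions(events):
    """Count moves between consecutive, distinct interface elements."""
    return Counter((a, b) for a, b in zip(events, events[1:]) if a != b)


# Hypothetical click log for one participant (element names are illustrative).
log = ["posts", "entrance", "dialogue", "puzzle", "dialogue", "questions", "dialogue"]
transitions = count_transitions(log)

# Number of times the dialogue window was reached from the puzzle.
print(transitions[("puzzle", "dialogue")])
```

Summing such counters over all participants would give the edge weights of a diagram like Figure 5.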

5.2.3 The Conversation Mode Facilitates a Rapid Understanding of New Topics. Several participants found the prototype particularly beneficial when exploring new topics. P17 stated that the prototype was "informative and valuable" for such endeavors, and P6 stated that "the conversation mode provided me a more convenient and efficient way to acquire new and diverse information". Table 2 reveals that Agent 5 (the economist) and Agent 3 (the HR manager), who were generally considered more knowledgeable about the policy under discussion, were engaged in the highest number of conversational rounds. Additionally, when it came to asking questions about the policy, these two agents were the most frequently queried (Agent 3: 24%; Agent 5: 23%), on topics such as factors to consider in policy implementation and international policy practices. These results suggest that participants tended to consult these potentially knowledgeable agents for basic knowledge on the topic, which was echoed by P13, who expressed a desire for the AI agents to "provide some basic knowledge about this topic".

Depth of Information Processing (DC2)
5.3.1 Talking with Multiple Characters could Stimulate Users' Reflection. Overall, participants believed that conversations with AI agents facilitated deeper contemplation on the topic (14 out of 18 rated agree/strongly agree, 3 rated neutral, and only 1 rated disagree/strongly disagree, Mean = 3.78, SD = 0.65). During the conversations, participants could opt to respond either by typing or by selecting from the system-generated options. Log data revealed that 17 out of 18 participants engaged in typing at some point, despite not being explicitly requested to do so during the study, suggesting a certain level of deliberation, as opposed to a shallow interaction through clicks. Furthermore, semantic coding of the participants' responses showed that 12 participants typed their own opinions, oppositions, or counter-questions to the AI agents, further substantiating the argument that there was a certain level of in-depth thinking during conversations.
We identified two primary reasons according to the feedback during the post-interviews. Firstly, the conversational nature was conducive to stimulating deeper thinking. As one participant pointed out, "it allows me to discuss with 'people' about the topic, and the very process could prompt me to think more deeply" (P18). P16 noted that "it's not always convenient to discuss these topics with friends, and if you try to engage through forum posts or comments, there may not be immediate or any responses. On the contrary, interacting with the agents in the system is convenient and inspiring." Secondly, due to the presence of multiple AI agents, the diversity of their viewpoints also helped users to "understand and reflect on the topic from various perspectives" (P6). P7 noted that "engaging in conversation with diverse characters with different viewpoints made me think more critically". However, 4 participants also expressed concerns about the credibility of the AI agents, as it "lacked evidence and concrete examples to substantiate their claims" (P12). In the post-survey, 7 out of 18 participants indicated that they disagree/strongly disagree with the statement that they could be persuaded by the AI agents (9 rated neutral, 2 rated positive, Mean = 2.61, SD = 0.85). The lack of credibility "somewhat limited my inclination for in-depth and meaningful discussions with the agents" (P18).

5.3.2 Role of Response Options. Participants also pointed out that the response options allowed them to "think more extensively" (P11). Since we designed the provided options to be questions that could be asked based on the agent replies, these questions encouraged participants to "further inquire and engage in dialogues" (P5), probably because the options stimulated participants' curiosity.

5.3.3 Role of the Viewpoints Puzzle and Multi-choice Questions in Summarization. Some participants stated that the Viewpoints Puzzle served as a useful tool for "summarizing and organizing various viewpoints" (P3, P11). P12 went further and suggested the map could be "organized according to the viewpoints, such as along an axis indicating support versus opposition." Regarding the multi-choice questions, 13 out of 18 participants rated that these questions facilitated a better understanding of each AI agent's perspective, while 5 rated neutral (Mean = 3.78, SD = 0.55). "The content of the questions are concise, helping me to easily grasp the main ideas" (P14) and "enhance understanding" (P17), though P18 did not consider the assessment module necessary, because "Individuals naturally comprehend viewpoints that interest them without needing specific assessment tools. Viewpoints that fail to capture one's interest are not seen as crucial to understand." In addition, an intriguing behavioral pattern emerged: some participants would click back and forth to review the chat logs while answering the questions. A typical example is P6 (Figure 4c, where P6 reviewed the dialogue history while answering the multiple-choice questions). This further attests to the role of multiple-choice questions in encouraging deeper processing of the contents.

User Engagement (DC3)
5.4.1 Effect of Gamification Design. Overall, participants were willing to engage with the AI agents to acquire information (13 out of 18 rated agree/strongly agree, 3 rated neutral, 2 rated disagree/strongly disagree, Mean = 3.67, SD = 0.77). They were also motivated to interact through the gamified feature of "lighting up" the Viewpoints Puzzle. On average, participants illuminated 4.61 agent avatars by completing five or more rounds of conversation with each agent; notably, 12 participants successfully lit up all the avatars. Similarly, through answering multiple-choice questions, participants on average lit up 4.33 puzzle pieces of the agents; 11 participants successfully lit up all the puzzle pieces. Most participants acknowledged a desire to illuminate the Viewpoints Puzzle, with one participant noting that successfully doing so acted as "positive reinforcement that made him feel he have got to know the corresponding agents" (P16). However, a few participants suggested that "more tangible rewards would be more useful" (P1), while P18 felt that the map gave him "a sense of obligation rather than motivation."

5.4.2 Effectiveness of Role-playing of the AI Agents. During the conversations, all participants used the second-person pronoun to talk to the AI agents, such as "What do you think about this issue?" (P1) or "I don't think you are right" (P5). Furthermore, 13 out of 18 participants asked the agents at least one personal question, such as "What are your plans?" (P1) or "Have you considered changing careers before retirement?" (P3). These observations indicate that our embodiment of the AI agents was effective, as users indeed treated them as distinct characters.
However, many participants pointed out that the effectiveness of AI agents in role-playing is still lacking in two aspects. Firstly, the authenticity of role-playing was inadequate, particularly for blue-collar workers. As participants expressed in the interviews, "A deliveryman being highly knowledgeable about policies seemed unrealistic and inconsistent with my expectations" (P17); "I prefer talking to agents whose statements align with their identity" (P10); "When I asked factual questions, like existing policies, the answers were quite similar across the agents" (P4).
Secondly, the AI agents fell short in their ability to convincingly play "real humans". "They don't feel personal enough," said one participant (P6). This limitation may be related to the fact that we limited the response length of the agents in our prompts. "Their responses were all of a uniform length, which is not very human-like. A mixture of long and short responses would be more realistic" (P16); "I wish the format of the responses could be more diverse, such as including images or emojis" (P18).

5.4.3 Cognitive Load. Scores from the NASA-TLX scale indicated that the system did not impose a significant burden on the participants (Table 3). Specifically, the response options effectively reduced participants' cognitive load, as evidenced by the fact that 55% of user responses were made by clicking on these options. Participant P18 noted, "The setting of the options is great; they were different from one another, and I could basically always find what I want."

5.4.4 User Perceptions of the Entrance. Regarding the setting of the entrance, participants had varying suggestions. Some participants suggested more interaction between the multi-agent system and the posts, such as "including AI agents' responses in the post might make me feel more engaging" (P10) and "hoping to discuss the post content with the agents" (P2). Some participants hoped for a permanent entrance for on-demand access, which would bring "a sense of control. It currently looks like the posts, which can be accidentally clicked on" (P18). In addition, some participants suggested the system could "automatically detect if I'm currently in a filter bubble and provide new perspectives accordingly" (P12).

DISCUSSION
Our research aimed to address the two RQs outlined earlier: how such an LLM-powered system should be designed, and whether and how such a system may help users access and reflect on diverse information. For the first RQ, we orchestrated a participatory design workshop to brainstorm ideas, from which we derived three design considerations. Then we defined key interaction features accordingly and finalized the prototype design. For the second RQ, we implemented this prototype featuring LLM-powered multi-agent characters that participants interacted with while reading social media content and ran an evaluative study. Our analysis, including participants' rating scores, interaction patterns, and interviews, unveiled three main insights:
• Participants demonstrated interest in interacting with the LLM-powered multi-agent system. Even when the AI agent's viewpoints challenged their existing beliefs (e.g., Agent 5), they were willing, if not more inclined, to engage in dialogue, facilitated by well-designed gamification incentives and an inherent motivation probably driven by curiosity.
• Progressive interactions with assessment tasks could deepen participants' understanding of opposing viewpoints and provoke thoughtful and careful consideration among them, an essential step towards escaping filter bubbles.
• Two main technical barriers to leveraging current Large Language Models to effectively deliver diverse perspectives were revealed by participants' concerns: inaccurate character representation, and over-generalization lacking contextual depth.
In this section, we delve into these insights one by one, and discuss design implications with the outlook of future work for better assisting users to burst their filter bubbles.

Motivating Engagement through Exploratory Time
Our study showed that when users conversed with AI-generated multi-agents possessing diverse viewpoints, they displayed a desire to understand the reasoning behind these perspectives and how they were formed, rather than dismissing them. Most users enjoyed the experience of interacting with different roles and found it helpful for obtaining diverse information. In fact, some participants even deviated from the predetermined sequence of conversations to prioritize interacting with the agents they found most intriguing, indicating a significant degree of motivation and engagement. Such motivation was further enhanced with the introduction of a small design feature where participants could light up all five pieces of the Viewpoints Puzzle after completing interactions with all the multi-agent characters.

Large Language Models, such as GPT-4, have demonstrated the ability to convincingly portray multi-agent characters with extensive domain-specific knowledge [11,53]. This breadth and expertise allow each character to generate distinctive viewpoints with compelling reasoning that is consistent with their character [43,73]. Our designed characters span a range of professions, from economists to blue-collar workers, ensuring that the perspectives presented will not be limited to or influenced by the background of any particular group of people. Conversational interfaces could help people retrieve information quickly, as the natural conversational flow allows people to get concise and relevant information, and the interactive nature of conversation can adapt to users' needs in real time [18]. In our study, participants also reported that the dialogue flow design was helpful for their rapid understanding of a new topic. As the conversation developed, users had the opportunity to engage with each character by asking questions, and responses were generated instantly by GPT-4. This experience stood in contrast to traditional online content platforms, where it is often difficult to interact with other users in real time through comments or posts. Consequently, the direct and interactive mode of conversing with AI agents about specific topics emerged as a compelling option for users browsing online media content.
One participant exhibited an interesting behavior by asking some AI agents to change their roles and answer the same question again (P16). Some participants were even curious about what would happen if AI-generated characters interacted and discussed their perspectives with each other. Prior work has explored the design and development of a virtual world using Large Language Models, in which generative agents have simulated minds with memories and experiences that allow them to interact with each other [53]. Future work can explore how these social interactions among AI agents can help engage people with diverse perspectives.
While the effectiveness of generating diverse outputs by LLMs is established, the consistency of these outputs over extended conversations remains an open question. Future work is necessary to explore the utilization of interactive design features and human-in-the-loop feedback to ensure that the same prompt consistently produces responses with similar viewpoints at different times, maintaining a consistent character voice.

6.1.2 Design Implication 2: Design Gamification Incentives to Promote Exploration
The integration of game design elements into applications has gained increasing interest in recent years [40]. In education, healthcare, and customer engagement, gamification incentives can serve as a powerful tool to motivate users to achieve specific goals or outcomes through game design elements, such as points, badges, leaderboards, etc. [14,24,34,77].
In our prototype, we designed gamification incentives accordingly, i.e., collecting all pieces of the Viewpoints Puzzle as a common form of badges, to encourage participants to engage more with new and challenging information. Our study suggested that the puzzle-collecting design effectively motivated users to explore and seek out information with diverse perspectives. Participants frequently mentioned using the Viewpoints Puzzle as a navigation bar due to its intriguing nature, with many wanting to discover what happens next by clicking on different parts. Additionally, the design instilled a sense of "winning desire" in some participants, leading them to want to collect all pieces of the puzzle through interacting with multi-agent characters. Therefore, the gamification incentives might make users more inclined to interrupt their habitual consumption of scrolling through media content and engage with the system.
It is worth noting that one participant exhibited an interesting behavior pattern that we referred to as "rushing to the finish line" (P2). She focused exclusively on assembling all the puzzle pieces as quickly as possible, rather than taking the time to ask questions and understand the perspectives generated by the characters. This finding suggests that excessive reliance on gamification may lead some users to prioritize completing tasks or achieving rewards over the actual learning engagement itself. Previous research has found that extrinsic rewards might undermine users' intrinsic motivations [19]. Further work could explore designing interactions that tap into users' intrinsic motivations to create experiences that prioritize genuine engagement and avoid the potential pitfalls of over-gamification. For example, the system could provide generative feedback that highlights the user's progress and understanding. Moreover, rewards and incentives can be properly designed to encourage collaboration among multiple agents and human users, shifting the focus from individual rewards to collective achievements and shared experiences.
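The "lighting up" rule reported in the results (an avatar lights up after five or more conversation rounds with an agent, and a puzzle piece lights up through answering that agent's multiple-choice questions) can be sketched as a small state function. This is an illustrative sketch rather than the prototype's code: the function and field names are hypothetical, and treating a puzzle piece as lit once the questions are completed, regardless of correctness, is an assumption.

```python
ROUNDS_TO_LIGHT_AVATAR = 5  # threshold reported in the results


def lit_state(rounds_by_agent, quiz_done_by_agent):
    """Return, per agent, whether its avatar and its puzzle piece are lit.

    rounds_by_agent: agent name -> completed conversation rounds.
    quiz_done_by_agent: agent name -> True once its questions are answered
    (assumption: completion, not correctness, lights the piece).
    """
    return {
        agent: {
            "avatar": rounds >= ROUNDS_TO_LIGHT_AVATAR,
            "puzzle_piece": quiz_done_by_agent.get(agent, False),
        }
        for agent, rounds in rounds_by_agent.items()
    }


# Hypothetical session: one agent fully explored, one barely visited.
state = lit_state({"Agent 1": 5, "Agent 2": 3}, {"Agent 1": True})
print(state)
```

A rule this simple rewards reaching the threshold rather than the quality of the exchange, which is one way to read the "rushing to the finish line" behavior described above.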

Designing Progressive Interactions with Assessment Tasks to Enhance Deep Thinking and Understanding
Our work explored the use of progressive interactions with assessment tasks to encourage critical thinking and understanding of diverse perspectives. The user study indicated that these types of interactions could encourage participants' deliberate consideration of different viewpoints. The progressive interactions facilitated critical thinking by gradually increasing complexity and diversity, as participants engaged in careful considerations and thoughtful discussions while completing assessment tasks.
6.2.1 Design Implication 3: Providing Progressive Interactions. Previous work suggests that structured progressive interactions could enhance critical thinking abilities among people [26,74,80]. Our study extends prior work by showing that presenting diverse viewpoints through natural conversational interactions with AI-generated characters encouraged participants to give careful consideration to new information.
In our prototype, two progressive interaction designs were implemented to promote deliberate and critical thinking. Immediate feedback from AI-generated dialogue serves as a natural progression, providing contextual information such as reasoning chains, examples, and stories through questioning and answering, leading participants towards a deeper understanding of differing viewpoints. The progressive role setting, starting from the most similar viewpoint to the original, then gradually introducing more nuanced and diverse perspectives, guided participants from a basic understanding of their existing beliefs to more critical thinking. However, the evaluation of users' perceptions towards the preset presentation order, as well as the effectiveness of this sequence for all users, remains unexplored. In addition, text-based dialogues alone may not fully capture the nuanced information present in human conversations [35,55]. Future work could incorporate more multi-modal interaction techniques, such as vocal emotions, micro expressions, and body language, in detecting users' intent, attitude, and familiarity with the topic and viewpoints, to promote deep thinking through more customized feedback.
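One way to express the progressive role setting is to order agents by how far their stance lies from a reference viewpoint, so that the closest perspective is presented first. The sketch below assumes each agent's stance can be summarized as a single number between -1 (opposed) and 1 (supportive); these scores, the agent names, and the function are hypothetical and not part of the prototype described here.

```python
def progressive_order(agent_stances, reference_stance):
    """Order agents from the closest to the farthest stance from the reference."""
    return sorted(
        agent_stances,
        key=lambda name: abs(agent_stances[name] - reference_stance),
    )


# Hypothetical stance scores; the reference stands for the posts' prevailing view.
stances = {"Agent C": 0.2, "Agent A": -0.8, "Agent D": 0.9, "Agent B": -0.3}
order = progressive_order(stances, reference_stance=-1.0)
print(order)  # ['Agent A', 'Agent B', 'Agent C', 'Agent D']
```

Using a per-user reference stance instead of a fixed one would be a natural way to study whether a single preset order suits all users.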

6.2.2 Design Implication 4: Designing Assessment Tasks. As discussed previously, incorporating game elements into the design could foster user engagement in diverse perspectives. By presenting assessment tasks, such as multi-choice question sessions, along with providing gamification incentives, we could further promote deep thinking and create a synergy where the total impact is greater than the sum of its parts [12,34,36]. Our study demonstrated that some participants also frequently switched between conversations with AI agents and the assessment tasks, indicating a higher level of thinking and comprehension. However, the optimal balance between gamification incentives and the challenge of assessment tasks remains unclear. P3 described the assessment tasks as "rigid like a quiz in high school class," while P16 found the multi-choice questions to be "too easy without challenge." Further research is necessary to develop adaptive assessments with feedback loops that align with the engagement and thinking levels of users, to facilitate continuous improvement.

Technical Barriers and How to Overcome
Although our work demonstrates the promising capabilities of LLMs in content generation and anthropomorphization, enriching user engagement and fostering deep thinking about diverse perspectives, some challenges and concerns also came to light. In our study, participants identified two technical barriers: inaccurate character representation and a lack of contextual depth. Some participants noted that the tone of some responses did not match the character's personality. Additionally, some participants reported that some AI-generated content lacked necessary topic-specific details and elaborations, resulting in generic and shallow responses. These findings are in line with prior work that examined the performance and capabilities of LLMs in content generation and emulating characters [57,68].
These technical challenges need to be addressed to utilize these capabilities effectively and responsibly. We propose the following design implications for creating more inclusive and accurate experiences to navigate people out of their filter bubbles.

6.3.1 Design Implication 5: Improving Inaccurate or Biased Character Representation. Large Language Models are trained on extensive data from the internet, which can lead them to reflect biases present in those datasets [11]. When asked to represent a character or perspective that is underrepresented in their training data, the outputs can be inaccurate [23,57,71]. For example, the GPT-generated characters in our study exhibited some gender biases, such as women playing the roles of janitor and HR manager while men were entrepreneurs and economics professors. These biases could be attributed to the stereotypes that the model inherited from the internet data used for training. To mitigate these and similar issues in the future, it is crucial to employ more carefully curated data and fine-tuned models, adhere to ethical and responsible AI guidelines, and incorporate human oversight before deploying such a system in real-world applications. In addition, engaging in discussions on a topic with users typically requires that LLMs possess relevant background knowledge about the subject. For example, in our study some users asked the AI agents about retirement and pension policies in other countries. Although GPT accurately retrieved information on policies in Germany and Sweden, it sometimes incorrectly stated that Dutch residents had flexibility in choosing their pension age between 60 and 70, a claim we could not verify through our best online search. Such unconfirmed information could lead to mistrust and potentially detrimental outcomes among users who rely on it. Thus, it is crucial to address issues such as hallucinations [78] or factuality issues [66] before deploying this system in practical settings. While prior work is limited in directly addressing these issues, research has shown potential in utilizing Reinforcement Learning from Human Feedback (RLHF) to fine-tune language models based on human feedback to better align with human intent [29,52], or prompt engineering techniques to generate content that follows factual information [63]. Future work can focus on collecting enriched information and building character-centered datasets to further fine-tune LLMs, generating more accurate and fair representations of characters.

Design Implication 6: Prompting with Interactive Design to Enhance Contextual Depth.
Generative language models do not possess human-level cognition for understanding the deep cultural, historical, or emotional contexts of characters and perspectives [4], which can result in outputs that lack contextual depth. However, there is evidence that LLMs have significant potential in few-shot learning and in-context learning [3,17]. Changing just a few examples or prompts can help LLMs adjust their output and mitigate inaccuracies in generated content [43,73]. Future work should explore interactive design techniques that make it easier for users to edit and iterate on prompts or provide examples that align with the detailed contextual background, personality, and nuances of a particular character.
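As a minimal sketch of what such an interactive technique might produce behind the scenes, the snippet below assembles a few-shot prompt from character details and example utterances that a user could edit. The `CharacterSpec` fields, the `build_few_shot_prompt` helper, and the prompt wording are hypothetical illustrations; they are not the prompt templates used in the prototype (Figures 6 to 9).

```python
from dataclasses import dataclass, field


@dataclass
class CharacterSpec:
    """Hypothetical container for user-editable character context."""
    name: str
    background: str
    personality: str
    example_utterances: list[str] = field(default_factory=list)


def build_few_shot_prompt(spec: CharacterSpec, user_message: str) -> str:
    """Assemble a plain-text prompt; the wording is illustrative only."""
    lines = [
        f"You are role-playing {spec.name}.",
        f"Background: {spec.background}",
        f"Personality: {spec.personality}",
    ]
    # User-supplied examples act as the few-shot demonstrations.
    if spec.example_utterances:
        lines.append("Examples of how this character speaks:")
        lines.extend(f"- {u}" for u in spec.example_utterances)
    lines.append(f"User: {user_message}")
    lines.append(f"{spec.name}:")
    return "\n".join(lines)


spec = CharacterSpec(
    name="Agent 2",
    background="A cleaner close to retirement age",
    personality="Direct, worried about physical health",
    example_utterances=["My main concern is whether my health will hold up."],
)
prompt = build_few_shot_prompt(spec, "Should physical laborers retire earlier?")
```

Editing `example_utterances` or `background` and regenerating the prompt is the kind of lightweight iteration loop this implication has in mind; how the prompt is then sent to a model is left out on purpose.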

Limitations and Future Work
There are three primary limitations in this study: the system's usefulness awaits further evaluation, the scope of the prepared topic was limited, and the study was conducted in a laboratory setting. First, we aimed to explore how such a system should be designed (RQ1) and whether it could help users access and reflect on diverse information (RQ2) in our study. However, the extent to which encouraging users to access and contemplate diverse information contributes to effectively breaking the filter bubble is yet to be determined. Future research could compare participants' initial opinions before using the system with their post-use opinions, or contrast experiences with and without using the system. Additionally, conducting a controlled study to compare the utility of our system with other methods aimed at helping users overcome the filter bubble would also be helpful. Second, the selected topic, delayed retirement policy, may limit the generalizability of the results, as participants may lack interest or motivation to engage with the LLM-powered multi-agent characters for in-depth discussion on such a topic. Future work can consider exploring a broader range of topics to investigate whether and how such an LLM-powered multi-agent system may help users burst filter bubbles in a variety of contexts. Third, we conducted the evaluative study in a laboratory setting, where participants were required to complete tasks independently within a limited timeframe. However, participants' preferences and behaviors may differ if they were to interact with the system in a more flexible and extended setting. For example, one of the most intriguing questions is whether users will voluntarily pause their online browsing activities to engage in 5-10 minute conversations with our multi-agent characters without explicit requests. Although these interactions may initially seem unnatural, reflecting our own experiences with social media, observations from our laboratory experiments suggest that users may feel motivated to interact with the system voluntarily when they can freely browse social media. For instance, many (N=9) participants reported that the desire to win the game led them to frequently jump back and forth between different agents, more than required, in order to correctly answer questions and collect puzzle pieces. P16 deviated from the task by asking some agents to play a different role and answer the same question again. Also notably, P3 and P16, on their own initiative, even asked whether they could have agents discuss among themselves and come back with new responses. However, we acknowledge that more systematic studies are warranted to further investigate this open question. For example, future work should consider conducting a longer-term field study to investigate how social norms, communication, and interactions among users may impact their information consumption and the potential of the system to promote diverse perspectives in real-world settings.
In future research and real-world applications, several aspects of our system can also be improved. First, the display of the entrance is determined by a predefined rule in our current system. For future research and practical implementation, the timing of the system's entrance display deserves more careful consideration. To display the entrance when users need it, future systems could assess the extent of the filter bubble, such as whether the attitudes are one-sided or the online voices are self-reinforcing, and then determine the timing of the system's entrance accordingly. Second, future research could consider the states of users during their interaction when designing the system. On the one hand, by monitoring user interaction behaviors, the agents' responses could be dynamically adjusted. For instance, if semantic analysis detects a user becoming irritated with an AI agent, the agent could employ techniques to soothe and stabilize the user's emotions. On the other hand, as continuous interaction [2] and motivation to comprehend an agent's behavior [81] may enhance users' tendency toward anthropomorphism and result in over-trust towards the agents [20,37,65], interventions should be implemented upon detecting signs of over-trust or negligence. Indicators could include, but are not limited to, overly rapid responses, showing complete agreement in dialogue, or engaging predominantly with a single AI agent.
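To make the three over-trust indicators concrete, the following is a minimal sketch of how they might be computed from an interaction log. The `Turn` record, the `over_trust_signals` function, and every threshold value are assumptions chosen for illustration; none of them were implemented or validated in our study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Turn:
    """Hypothetical log record for one user reply."""
    agent_id: int         # which AI agent the user replied to
    response_secs: float  # time the user took to reply
    agreed: bool          # whether the reply expressed agreement


def over_trust_signals(turns, min_secs=3.0, rapid_share=0.5,
                       agree_share=1.0, single_agent_share=0.8):
    """Flag the three indicators; all thresholds are illustrative guesses."""
    if not turns:
        return {"rapid_responses": False,
                "complete_agreement": False,
                "single_agent_focus": False}
    n = len(turns)
    rapid = sum(t.response_secs < min_secs for t in turns) / n
    agree = sum(t.agreed for t in turns) / n
    # Share of turns spent with the most frequently addressed agent.
    top = Counter(t.agent_id for t in turns).most_common(1)[0][1] / n
    return {
        "rapid_responses": rapid > rapid_share,
        "complete_agreement": agree >= agree_share,
        "single_agent_focus": top >= single_agent_share,
    }


hasty = [Turn(agent_id=1, response_secs=1.0, agreed=True) for _ in range(5)]
signals = over_trust_signals(hasty)
```

In this sketch a user who answers every turn within a second, always agrees, and talks to a single agent triggers all three flags; a deployed system would need empirically grounded thresholds before intervening.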

CONCLUSION
In today's world, new technologies such as AI-powered search and recommendation systems are implicitly influencing the way people consume information. Unfortunately, this can result in people being trapped in isolated filter bubbles with narrowed perspectives and reinforced biases. Escaping these filter bubbles can be challenging, as it requires not only exposing users to diverse information but also motivating them to engage with that information, especially opposing viewpoints, through in-depth thinking. Our research aimed to understand how to design a system that leverages the power of Large Language Models to address the issue of filter bubbles, and whether and how such a system could help users broaden their perspectives. To achieve this, we conducted a participatory design workshop that involved various roles such as HCI and UX researchers, designers, and psychologists, all of whom are also users of online content platforms. Through this process, we identified three key design considerations with distinct interaction features that could guide users towards diverse perspectives. In light of these considerations, we designed and developed a prototype with LLM-powered multi-agent characters that users could interact with while reading social media content, and conducted an evaluative study with 18 participants. Based on our findings, we extracted six design implications and discussed the outlook for future work, for researchers and designers to consider and explore when designing generative multi-agent systems to better assist people in bursting their filter bubbles.

A PROMPTS
Prompts used for generating AI agent personas, AI agent conversation openings and user response options, AI agent responses and user response options, and assessment tasks are shown in Figure 6, Figure 7, Figure 8, and Figure 9, respectively.

B EXAMPLES OF CONVERSATION LOGS
Following are the conversation logs from three example users. Specifically, User (type) denotes that the user responded to the agent by typing in the input box, and User (choose) indicates that the user interacted with the agent by selecting response options provided by the system. The conversations took place in Chinese and were subsequently translated.
while preserving my independence and dignity. However, the issue at hand still revolves around the unfairness of the delayed retirement policy towards us, the physical laborers.

B.1 P1: Interest-driven Conversation Order
Agent 1:
• User (type): Do you have any plans if you retire early?
• Agent 1: Should I opt for early retirement, I might consider opening a small, simple shop or engaging in light physical work to continue providing for my family. I understand the government's concerns, but I believe they should also take into account our actual circumstances.
• User (type): Why do people want to retire early?
• Agent 1: The desire for early retirement primarily stems from a concern for one's own well-being and that of one's family, as physical laborers cannot always guarantee maintaining good health. Additionally, some people believe early retirement offers the opportunity to pursue personal interests and enrich their lives. Don't you think that this too represents an ideal lifestyle?
Agent 3:
• User (type): I agree. How do retirement regulations in other countries compare to ours?
• Agent 3: Retirement regulations vary significantly from country to country. For example, Australia offers a universal pension and compulsory corporate pensions, allowing employees to choose their retirement time upon meeting certain conditions. Germany, on the other hand, implements a policy of gradually delaying retirement age, offering a degree of flexibility for employees.
• User (type): Then why don't these countries restrict retirement time?
• Agent 3: These countries do not restrict retirement time because they believe that offering flexible retirement policies better meets the life needs and actual working conditions of employees, making the pension system more sustainable. This approach strikes a better balance between the interests of individuals and the nation, while respecting individual choices.
Agent 4:
• User (type): As an investor, wouldn't you feel despondent about life if your entrepreneurial venture failed, leaving you both financially and materially depleted?
• Agent 4: Indeed, the failure of a business venture poses a significant challenge, but life is inherently filled with ups and downs. Maintaining an optimistic and resilient attitude, learning from failures, and continuing to strive forward are crucial in overcoming setbacks. Moreover, the essence of life is not solely found in wealth; support from family and friends is also immensely important. As long as we stay positive, we can always find ways to overcome difficulties.
• User (type): But if you go bankrupt in your thirties, you still have to find a way to get through the next 30 plus years.
• Agent 4: Yes, facing bankruptcy in one's thirties is indeed a daunting challenge. However, during this period, one can explore new employment opportunities or learn new skills in different fields to bounce back. Life is full of uncertainties, but we must believe that we will find a path that suits us.

Agent 5:
• User (type): From an economic perspective, what you say makes sense; but from the standpoint of people's livelihood, it seems overly exhausting.
• Agent 5: I understand your concerns. Issues regarding people's livelihood are indeed paramount, and it's essential to consider everyone's needs and pressures. In fact, policies like delayed retirement could be gradually implemented, accompanied by training and educational measures to enhance the quality of the workforce, thereby supporting livelihoods. We all strive for a better quality of life.
• User (type): But doesn't improving quality also require spending money? Where will the funds come from?
• Agent 5: Yes, enhancing quality does require financial investment, but such expenditures can yield long-term benefits, such as increased overall labor productivity, higher tax revenues, and support for sustainable fiscal development. From this perspective, it represents a valuable investment.
• User (type): With the existing generation gap, how would a 21-year-old university graduate leading a team communicate with a 65-year-old employee in the future?
• Agent 5: I understand your concerns about the challenges posed by age differences. In reality, collaboration between multiple generations in the workplace can facilitate the exchange of experience and new ideas, enhancing the team's innovative capabilities. We can improve intergenerational understanding and cooperation by strengthening communication skills training.
• Agent 2: Yes, I'm glad you understand our perspective. The implementation of delayed retirement policies should take into full account the actual conditions of different industries and workers, with special attention and support directed towards physical laborers.
• User (type): Could it be that the current system is still imperfect?
• Agent 2: It is possible that the existing system is not yet perfect. Policymakers could further refine the policies to accommodate the needs of different industries and workers, thereby benefiting a broader spectrum of people.

B.2 P18: System-guided Followed by Interest-driven Conversation Order
Agent 1:
• User (choose): Are there international examples of setting different retirement ages for different industries that we can learn from?
• Agent 1: Yes, some countries have implemented retirement policies with different retirement ages for different industries, allowing physical laborers, for instance, to retire earlier. This approach is more humanized and takes into account the actual conditions of employees in different industries. I believe our country should also look into these examples for reference.
• User (type): How is physical labor defined, as some jobs might fall between the two categories?
• Agent 1: Indeed, defining the boundary between physical and non-physical labor can sometimes be challenging. We can involve relevant experts and experienced individuals in setting these standards, and also consider the specific characteristics and job descriptions of different industries to ensure that the policy is more targeted.
• User (choose): How would you determine the retirement age for positions that involve both physical and non-physical labor?
• Agent 1: For such positions, we can refer to existing job classifications and specific job content to assess the proportion of physical and non-physical labor involved. In making actual determinations, factors like the employee's age and years of service can be considered, leading to the formulation of more flexible retirement policies to ensure fair treatment for employees in different positions.
• User (type): How would you address individuals who exploit loopholes in these policies?
• Agent 1: To address those exploiting policy loopholes, it's crucial to strengthen the supervision and enforcement of these policies. Individuals or businesses attempting to manipulate these loopholes for personal gain should face stringent fines and penalties. Additionally, improving the policies by developing more comprehensive and detailed regulations can help eliminate potential loopholes.
Agent 2:
• User (choose): I agree with your viewpoint, but differentiated retirement standards may lead to certain complexities. What challenges might arise in implementing such a policy?
• Agent 2: Indeed, differentiated retirement standards do introduce complexities, such as determining evaluation criteria for different industries and setting specific retirement ages. However, I believe that by developing rational policies and detailed planning, we can find a fairer solution that accommodates the actual conditions of each individual.
• User (choose): How should the government balance the needs of different industries and people to formulate a more targeted differentiated retirement policy?
• Agent 2: The government can balance these needs by conducting thorough research to understand the actual conditions and demands of various industries, and by setting different retirement ages based on the nature of different jobs and individuals' health conditions. This ensures that the retirement policies are more in line with reality. Additionally, the government should strengthen support and benefits for physical laborers.
• User (choose): When implementing differentiated retirement policies, if dissatisfaction arises among people in certain industries or positions, how do you think the government should adjust and respond?
• Agent 2: The government needs to actively listen to the opinions and needs of people in different industries and positions. This can be done through communication and dialogue to minimize discontent. Timely adjustments to the policy may be necessary to balance the interests of all parties and build consensus.
• User (choose): How can the transparency and fairness of differentiated retirement policies be ensured during implementation, to prevent some people from feeling discriminated against?
• Agent 2: Ensuring the transparency and fairness of these policies lies in establishing detailed regulations, treating all industries and positions fairly, and seeking public opinions transparently. These methods can ensure that everyone's rights are respected and help prevent any form of discrimination.
Agent 3:
• User (choose): If employees choose to delay their retirement, will their pensions increase?
• Agent 3: Yes, typically, when employees opt for delayed retirement, their pension amounts are likely to increase. This is because the extension of their working period leads to longer contributions to the pension fund. This can be an additional incentive for those who feel they still have the capacity to work.
• User (choose): Then, for those who wish to retire early because they feel unable to continue until the standard retirement age, does the government have policies in place to assist them?
• Agent 3: For individuals looking to retire early, the government may offer certain adjustment options based on their specific situations, such as the possibility of accessing part of their pension earlier. However, these specific policies and provisions would need to be adjusted according to the societal context and individual circumstances.
• User (choose): Do you think the current pension system is universally applicable, and why?
• Agent 3: The current pension system, to a certain degree, meets the retirement needs of most people. However, it may not be fully adaptable to some specific situations. There's a need to increase the system's flexibility and tailor it to address various practical issues, allowing more people to benefit from the social security system.
Agent 3:
• User (choose): If employees opt for delayed retirement, will their pension benefits increase?
• Agent 3: Yes, if employees choose to delay retirement, their pension benefits may increase due to the extended period of contributions towards the pension fund, resulting in a larger accumulation of retirement insurance funds.
• User (choose): How do you believe the policy of delayed retirement should balance the interests between the younger generation and the middle-aged and older individuals approaching retirement?
• Agent 3: I believe that the younger generation should indeed bear a greater share of the responsibility for social security to ensure the sustainable development of society. Only when all age groups undertake their due responsibilities can we collectively address issues such as labor shortages. The younger generation can create more value for themselves and their families through self-improvement and investment.
• User (choose): What advice do you have for the younger generation on planning for their future retirement and making pension investments?
• Agent 3: I recommend that the younger generation begin planning for retirement early by enhancing their pension reserves through various channels, such as actively participating in both corporate and personal pension insurance and investing in financial products. Additionally, developing good consumption and financial management habits, along with continuous learning and career development, is essential to meet future challenges.
• User (choose): How should one decide between personal pension insurance and corporate pension insurance?
• Agent 3: When deciding between personal and corporate pension insurance, it is important to comprehensively consider your actual needs and financial situation. It is advisable to analyze the contract terms, coverage scope, and investment returns of both options to find a balance that meets your personal needs, risk tolerance, and expected returns.
• User (choose): How do companies that establish corporate pension insurance schemes handle this insurance when an employee leaves the company?
• Agent 3: When an employee leaves, companies typically follow relevant regulations and the terms of the corporate pension insurance contract to process. The portion of the pension that the employee has contributed to may be transferred to their personal pension insurance account to safeguard their pension rights after departure.
• Agent 1: Yes, as physical laborers like ourselves, although our pensions might be relatively lower, we too have the right to enjoy retirement. Our generation has contributed to society's development and deserves equal treatment.
• User (choose): What age do you think is appropriate for physical laborers to retire?
• Agent 1: I believe it is appropriate for physical laborers to retire around the age of 60. This ensures that they still have the physical and mental capacity to complete their work before retirement, while also allowing them to enjoy a longer period of retirement and take good care of themselves.
• User (choose): Do you think delayed retirement is more suitable for non-physical workers?
• Agent 1: For non-physical workers, delayed retirement is indeed more appropriate, as their work relies more on knowledge and experience, which can actually enrich with age. However, for each individual, we need to consider their physical health and job requirements to make a decision.
• User (type): What do you think is the rationale behind delayed retirement?
• Agent 1: I believe the rationale behind delayed retirement is to address issues arising from an aging society, such as a decrease in population and insufficient pension funds. Delaying retirement can help reduce the pressure on pension systems and meet the long-term developmental needs of the country.
Agent 2:
• User (type): How much pension do cleaners receive after retirement? Is it enough to sustain their living?
• Agent 2: The amount of pension depends on the individual's working duration and the base of contributions. Generally, it should be sufficient to maintain basic living standards. However, as a physical laborer, my primary concern before retirement is whether I can maintain my health until the age of 65.
• User (choose): In your view, should physical laborers consider retiring earlier?
• Agent 2: I do believe that physical laborers should consider retiring early due to the significant stress and fatigue our work places on our physical health. Early retirement would not only safeguard our health but also create more job opportunities for the younger generation.
• User (choose): Have you thought about how to deal with potential financial pressures after early retirement?
• Agent 2: Yes, when considering the possibility of early retirement, I am indeed worried about financial pressures. I plan to focus more on savings and investments to adequately prepare for retirement. Additionally, I intend to seek part-time work involving light physical or non-physical labor to ease the financial burden.
• User (type): What is the rationale behind the delayed retirement policy?
• Agent 2: The primary rationale behind the delayed retirement policy is to address the challenges of an aging population. By raising the retirement age, the policy aims to reduce the burden on pension funds and make full use of the labor resources of the elderly. However, I believe this one-size-fits-all policy fails to consider the unique characteristics and needs of different industries and professions, particularly those of physical laborers.
Agent 3:
• User (choose): If employees choose to delay their retirement, will their pensions increase?
• Agent 3: Yes, if employees choose to delay their retirement, their pensions will likely increase due to the extended period of social security contributions. The amount of pension is influenced by the length of these contributions, although this can vary based on specific circumstances.
• User (choose): As a physical laborer, do you believe that professionals in non-physical jobs have more advantages under the delayed retirement policy? If so, could you share your perspective?
• Agent 3: Indeed, I believe professionals engaged in non-physical jobs have more advantages under the delayed retirement policy. Their work nature is relatively easier and causes less physical strain, making it easier for them to maintain good health and work longer.
• User (choose): For those in less physically demanding jobs, do you think they can make a greater contribution to society if they delay retirement?
• Agent 3: I do think that individuals in less physically demanding jobs can make a greater contribution to society if they delay retirement. By extending their working years, they can leverage their vast experience and specialized knowledge to play a more significant role in their professional fields.
• User (type): Do you think delayed retirement reduces job opportunities for young people?
• Agent 3: The delayed retirement policy might have some impact on the employment opportunities for young people, especially in a competitive job market. It could increase the pressure on young people looking for jobs.
• User (choose): If a flexible retirement policy is implemented, what impact do you think it would have on the employment of young people?
• Agent 3: Implementing a flexible retirement policy would have both positive and negative impacts on the employment of young people. On the positive side, early retirement of physical laborers could create more job opportunities for the younger generation. On the negative side, a flexible retirement policy might intensify the competition young people face when seeking jobs. Overall, however, it could help create a more equitable job market.
Agent 4:
• User (choose): As you mentioned, early retirement may release certain job opportunities, but a flexible retirement policy might also lead some who wish to retire early to continue working. What is your view on this situation?
• Agent 4: I see this situation as having two sides. On one hand, older individuals who continue to work can contribute to society with their rich experience and skills. On the other hand, this might impact the employment opportunities for young people to some extent. A flexible retirement policy should strike a balance, fully utilizing the comprehensive value of older workers while ensuring fair job opportunities for the younger generation.
• User (choose): What policies do you think the government could implement to help people adapt to delayed retirement?
• Agent 4: I believe the government could consider measures such as providing vocational training, fully utilizing the functions of workers' social insurance, implementing flexible employment policies, and establishing a comprehensive pension service system. These measures would assist people in better adapting to the delayed retirement policy.
• User (choose): Could you elaborate on the role of workers' social insurance in helping people adapt to delayed retirement?
• Agent 4: In aiding adaptation to delayed retirement, workers' social insurance can play a role by increasing pension subsidy standards, raising the payment ratio for pension insurance, and adjusting subsidy policies. These measures could alleviate the economic pressures brought by delayed retirement and strengthen confidence in the policy.
• User (choose): Are these adjustments easy to implement in practice? Could they put pressure on the national finances?
• Agent 4: Implementing these adjustments does pose certain challenges and needs to be within the country's fiscal capacity. However, with the right guidance and resource allocation, and by formulating targeted policies, it is possible to achieve them.
• User (choose): Which adjustments do you support to achieve a more equitable delayed retirement policy? Please share your thoughts.
• Agent 4: The adjustments I support include developing flexible retirement age policies tailored to different professions and industries, increasing the basic pension for physical laborers, and providing vocational training and transition opportunities for retirees. These measures would help balance the interests of different generations more equitably under the delayed retirement policy.
Agent 5:
• User (type): How does this policy inject vitality into social development?
• Agent 5: I believe that the delayed retirement policy can effectively utilize the labor resources of the elderly. It can alleviate the challenges of an aging population to a certain extent, reduce the burden of pension payments, and thus inject vitality into social development.
• User (type): But delaying retirement is just postponing the disbursement of pensions; it doesn't mean these people can't work.
• Agent 5: You are correct. Delaying retirement indeed only postpones the disbursement of pensions and does not mean these individuals cannot work. This also implies that people can consider transitioning or taking part-time jobs within the retirement age to maintain their health and living standards.
• User (type): So, do you think this policy fails to fully utilize the labor resources of the elderly?
• Agent 5: I believe this policy does utilize the labor resources of the elderly to a certain extent. Raising the retirement age allows older individuals to continue contributing to society with their rich experience and skills. However, the policy certainly needs improvement, especially in addressing the needs of physical laborers.
• User (choose): How do you think this policy should be improved to better cater to the needs of physical laborers?
• Agent 5: To better cater to the needs of physical laborers, the government should implement flexible retirement age policies tailored to different professions and industries, especially allowing for early retirement options for physical laborers. Additionally, increasing the basic pension for physical laborers is crucial to ensure their quality of life post-retirement.

C EXAMPLES OF ASSESSMENT TASKS
Following are examples of the assessment tasks in the Viewpoints Puzzle, one for each AI agent. The tasks were generated and displayed in Chinese, and were subsequently translated.

• Part 1 (15 minutes): Brain-writing, during which participants individually brainstormed and wrote down their ideas.
• Part 2 (40 minutes): Brain-sharing, where participants sequentially shared their ideas.
• Part 3 (30 minutes): Discussion of the shared ideas, including the corresponding pros and cons, as well as the emergence of new ideas inspired by those presented in Part 2.

Figure 2: Participatory design study process with each column representing one stage of the process.

Figure 3: Interaction design features in the prototype: (a) LLM-powered multi-agent characters, (b) Frictionless interaction flow, enabling users to initiate dialogues with AI agents swiftly during post reading via the comments section, (c) Viewpoints Puzzle, (d) Progressive viewpoints sequence, presenting characters whose viewpoints initially align with the existing beliefs, followed by the gradual introduction of characters with increasingly contrasting viewpoints, (e) Assessment task with multi-choice questions.

Figure 4: Typical behavioral patterns observed during user interactions with the prototype. (a) Interest-driven conversation order, where users engaged with AI agents in an arbitrary sequence based on their interests. (b) System-guided followed by interest-driven conversation order, where users initially conversed with AI agents in the order presented by the system and then revisited the AI agent(s) they were interested in. (c) System-guided conversation order, where users interacted with AI agents following the order set by the system.

Figure 5: Frequencies of user transitions between interface elements. Each arrow represents the direction of a transition from one part of the interface to another, and each number is the count of that transition across all participants in the study. Yellow arrows signify the transition paths leading to the dialogue window, while green arrows represent those leading to the Viewpoints Puzzle. Results showed frequent switches between the dialogue window and Viewpoints Puzzle (83+31 transitions from Viewpoints Puzzle to dialogue window, and 102+36 transitions from dialogue window to Viewpoints Puzzle), suggesting that the Viewpoints Puzzle was utilized as an index to navigate through conversations with various AI agents.
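
As a rough illustration of how transition counts like those in Figure 5 could be tallied, the Python sketch below counts directed transitions between consecutive interface elements in per-participant event logs. The log format, the element names, and the `count_transitions` helper are assumptions made for illustration, and the sample sessions are invented rather than taken from the study. Only the two sums at the end come from the caption.

```python
from collections import Counter


def count_transitions(sessions):
    """Tally directed transitions between consecutive interface elements.

    Assumption: each session is a list of element names in the order the
    participant visited them (a hypothetical log format).
    """
    counts = Counter()
    for events in sessions:
        for src, dst in zip(events, events[1:]):
            if src != dst:  # skip consecutive events on the same element
                counts[(src, dst)] += 1
    return counts


# Invented logs, one list per participant (not data from the study).
sessions = [
    ["post", "dialogue", "puzzle", "dialogue", "puzzle"],
    ["post", "puzzle", "dialogue", "puzzle"],
]
transitions = count_transitions(sessions)

# Totals implied by the sums reported in the caption.
puzzle_to_dialogue = 83 + 31   # 114 transitions
dialogue_to_puzzle = 102 + 36  # 138 transitions
```

Keying the counter on `(source, destination)` pairs keeps the two directions separate, which is what the directed arrows in the figure require.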

Figure 6: Prompt template for generating AI agent personas.

Figure 7: Prompt template for generating AI agent conversation openings and user response options.

Agent 2:
• User (type): Then can you change your job?
• Agent 2: While changing jobs is an option, it is not an easy task given my advanced age, limited education, and skills. Furthermore, my concern pertains to the unfairness of the delayed retirement policy towards physical laborers.
• User (type): Can your children support you, especially given your age?
• Agent 2: Although my children are willing to support me, I prefer to maintain my livelihood through my own efforts. This way, I do not impose additional burdens on my children,

Figure 8: Prompt template for generating AI agent responses and user response options.

Figure 9: Prompt template for generating assessment tasks.

Agent 1. What does Wu Xiaofei believe is the rationale behind the delayed retirement policy?
(A) (Wrong) It is for the sake of the country's economic development.
(B) (Wrong) It aims to enhance the standard of living for its citizens.
(C) (Correct) It seeks to address the issues of a declining population and insufficient pension funds.

Agent 2. What are Wang Yanli's primary concerns regarding the delayed retirement policy?
(A) (Wrong) She is worried about not having enough salary.
(B) (Correct) She fears her physical strength will not sustain her until the delayed retirement age.
(C) (Wrong) She is concerned about not having sufficient savings for retirement.

Agent 3. What kind of retirement policy does Zhang Xiaoning believe is more appropriate?
(A) (Wrong) A one-size-fits-all delayed retirement policy.
(B) (Wrong) Retirement at a uniform age as stipulated by the state.
(C) (Correct) A flexible retirement policy, where employees can choose their retirement time based on their physical condition and retirement preparations.

Agent 4. What kind of system does Li Zehan hope the government will introduce to help people adapt to delayed retirement?
(A) (Wrong) Increase pensions.
(B) (Correct) Provide psychological support and technical training.
(C) (Wrong) Reduce working hours.

Agent 5. What benefits does Professor Zhang Hua believe that the delayed retirement policy will bring to societal development?
(A) (Wrong) Improve the quality of life for individuals.
(B) (Wrong) Increase government fiscal revenue.
(C) (Correct) Inject more vitality into social development.
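
The multiple-choice format of the tasks above could be held in a small data structure. The Python sketch below is a minimal illustration, not the prototype's implementation: the `AssessmentTask` class, its field names, and the `is_correct` method are hypothetical, while the question text and answer key are copied from the Agent 4 example.

```python
from dataclasses import dataclass


@dataclass
class AssessmentTask:
    """One multiple-choice task tied to an AI agent (hypothetical structure)."""
    agent: str
    question: str
    options: dict  # option label -> option text
    correct: str   # label of the correct option

    def is_correct(self, choice: str) -> bool:
        """Return True if the chosen label matches the answer key."""
        return choice.strip().upper() == self.correct


task = AssessmentTask(
    agent="Agent 4",
    question=(
        "What kind of system does Li Zehan hope the government will "
        "introduce to help people adapt to delayed retirement?"
    ),
    options={
        "A": "Increase pensions.",
        "B": "Provide psychological support and technical training.",
        "C": "Reduce working hours.",
    },
    correct="B",
)
answered_correctly = task.is_correct("b")
```

Normalizing the chosen label before comparison is a convenience of this sketch; an interface that only offers clickable options would not need it.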

Table 1: Personas and perspectives for the five agents generated by GPT-4.
• User (choose): Regarding the current pension system, do you think we should consider introducing personal private pension plans to supplement it?
• Agent 3: Yes, introducing personal private pension plans could be a beneficial supplementary option. It would provide an additional layer of security for individuals' retirement. Of course, the design and implementation of such policies should take into account the country's economic conditions and the actual needs of the people.
• User (choose): From your perspective, how should personal private pension plans be designed to better meet everyone's needs?
• Agent 3: In designing personal private pension plans, it's important to consider reasonable investment returns and the individual's risk tolerance. Appropriate saving and investment options should be formulated to cater to the diverse retirement needs of different individuals.
• User (choose): Besides retirement age policies, what other aspects do you think can be improved to better cater to the needs of physical laborers?
• Agent 5: Beyond retirement age policies, improvements can include providing vocational training and transition opportunities, improving the working environment and conditions for laborers, and promoting health and wellness programs. These measures would offer support and benefits for physical laborers.