Communication, Collaboration, and Coordination in a Co-located Shared Augmented Reality Game: Perspectives From Deaf and Hard of Hearing People

Co-located collaborative shared augmented reality (CS-AR) environments have gained considerable research attention, mainly focusing on design, implementation, accuracy, and usability. Yet a gap persists in our understanding of the accessibility and inclusivity of such environments for diverse user groups, such as Deaf and Hard of Hearing (DHH) people. To investigate this domain, we used Urban Legends, a multiplayer game in a co-located CS-AR setting. We conducted a user study followed by one-on-one interviews with 17 DHH participants. Our findings revealed the use of multimodal communication (verbal and non-verbal) before and during the game, which affected the amount of collaboration among participants, and showed how their coordination with AR components, their surroundings, and other participants improved throughout the rounds. Drawing on participant perspectives and our analysis, we propose design enhancements, including onscreen visuals and speech-to-text transcription.


INTRODUCTION
Augmented Reality (AR) sits at the forefront of technological innovation, spanning multiple fields, such as education [2,12,20], healthcare [10,16,32], entertainment [19,26], and, more widely, other creative industries [13,50]. Notably, the use of AR to support simultaneous collaboration and shared views between multiple users has been extensively studied (see, e.g., [11,23,33,38,49,64]). The concept of shared AR (S-AR) comes into play here, which entails a co-located setting where users share a view of virtual objects overlaying the physical world [51]. Collaborative AR (C-AR), an extension of S-AR [7], involves simultaneous interaction with virtual objects within the real environment [56]. It can be divided into two categories based on the positioning of the users: co-located, where users are present in the same physical location, and remote, where users are distributed across multiple physical locations. Co-located and remote AR are each further divided into two categories: synchronous and asynchronous. Because our study revolves around synchronous collaborative AR, we focus on this kind. Synchronous collaborative AR is an environment where multiple users are positioned in the same 3D space while interacting with the technology and each other in real time. However, communication and collaboration among users within co-located CS-AR have, to date, been overlooked in the surrounding literature.
Co-located collaborative shared AR (CS-AR) environments have been shown to facilitate communication and collaboration among multiple users in a variety of scenarios [59,63], including multiplayer gaming settings [6,65]. Nonetheless, a notable gap exists in the current literature regarding the effectiveness of such environments in facilitating engagement among linguistic minority groups, such as Deaf and Hard of Hearing (DHH) people. Even though a plethora of studies have used AR to facilitate assistive environments for DHH people (e.g., [3,25,66]), few studies examine how DHH people communicate, collaborate, and coordinate in a co-located CS-AR environment. Additionally, previous studies emphasizing communication and collaboration in AR environments [27,40,65] often excluded DHH participants. This knowledge gap calls for increased scholarly attention to enhance the user experience and make co-located CS-AR environments more inclusive. Furthermore, it is crucial to recognize the lived experiences of the DHH community, which are often marked by a reliance on visual communication strategies like sign language and facial expressions. The social implications of accessibility challenges in AR, where visual and auditory elements predominate, contribute to isolation and affect various aspects of life for the DHH population. Addressing these challenges is essential for creating inclusive AR experiences that accommodate the diverse visual communication needs of the DHH community, thereby ensuring equitable access across educational, recreational, and professional domains.
Our study revolves around a game probe, Urban Legends, a multiplayer co-located CS-AR game developed by Niantic, a USA-based AR technology and locative game developer. Seventeen self-identified DHH people with heterogeneous hearing abilities and diverse primary modes of communication (e.g., spoken English, written English, and sign language) participated in our user study, which consisted of five rounds of gameplay followed by one-on-one interviews. Participants could choose between the two roles of the game, and the main objective of the game was to free allies from shackles. Through reflexive thematic analysis (RTA) [9], we identified themes regarding the communication, collaboration, and coordination of DHH participants in a co-located CS-AR gaming environment, as well as design implications for this environment drawn from the viewpoint of DHH users and from our analysis.
The novelty of our work lies in exploring how DHH users communicate in a co-located CS-AR environment in a multiplayer gaming context where their primary modes of communication are diverse and no common communication technology is embedded in the environment; how this affects collaboration among them; and how they coordinate with each other, with their surroundings, and with AR components. We also focused on the challenges DHH participants face in this specific environment and on their recommendations and design implications for overcoming those challenges. Our work can inform future research on accessibility and inclusivity in co-located CS-AR environments beyond the gaming context, such as education, training, and industry.
The key contributions of our work are as follows: (1) We provide an initial exploration of communication dynamics among DHH users in co-located collaborative AR environments through a user study. We specifically demonstrate:

BACKGROUND AND RELATED WORK
In the following section, we will present 1) existing literature about DHH users and AR, 2) prior works on communication, collaboration, and coordination of DHH people in shared space as the theoretical framework of our study, and 3) previous research on collaborative, co-located, and shared AR.

DHH Users and AR
AR projects virtual elements onto the physical world to enhance user experiences [4]. Several studies have explored AR as an accessible technology (technology that incorporates accessibility to accommodate users with disabilities [54]) for DHH users, aiming to provide them with a similar level of ease and adaptability as users without disabilities. Others have expanded beyond script transcription to address sound identity. Guo et al. [2020] introduced HoloSound, an AR prototype that employs deep learning to visualize sound identity, location, and speech transcription on HoloLens. Their emphasis extends to UI, system exploration, and haptic feedback. Li et al. [2022] introduced SoundVizVR, which visualizes sound characteristics and types in a virtual reality (VR) environment. They found the Full Mini-Map and On-Object Indicator to be the most effective combination for sound source visualization in VR. These studies highlight sound-related innovations in AR and VR applications for enhanced accessibility and communication.
The work mentioned in section 2.1, which was conducted to make communication more accessible for DHH people, paid more attention to the system's design, implementation, accuracy, or users' interaction with the system. No AR system or environment has been used to demonstrate how DHH users communicate among themselves in AR settings. This further highlights the considerable gaps in our knowledge. However, several studies have been conducted to understand how DHH people communicate and collaborate among themselves and with their hearing peers in shared environments. This existing knowledge can help in understanding how DHH people communicate, collaborate, and coordinate in collaborative, co-located, and S-AR environments. In sum, the existing research on communication, collaboration, and coordination of DHH people among themselves and with their hearing peers in a shared environment offers a comprehensive picture of the challenges, innovations, requirements, and potential for inclusive interaction. Co-located CS-AR can act as a shared environment among DHH people and their hearing peers, with enormous potential across various fields (e.g., academia, industry, and entertainment). Nevertheless, this remains an uncharted domain, much like the broader communication and collaboration aspects of co-located CS-AR environments.

Collaborative, Co-located and Shared AR
The communication and collaboration aspects of co-located CS-AR environments are still under-explored. Existing studies in this field involve collaborative, co-located, and shared AR in contexts such as education, entertainment, and professional workspaces. For instance, Van der Stappen [59] proposed a C-AR learning game called MathBuilder, where children from elementary school collaborated to solve math problems. López-Faican et al. [2020] explored competitive versus collaborative play's impact on communication and motivation among primary school children using Mobile Augmented Reality (MAR). Their findings included positive emotions in both game modes, even though the collaboration mode had a greater impact on emotional affection, social interaction with coordination, and interest. They further recommended design implications and suggestions that could be considered in future studies of MAR-based gamification strategies in educational settings.
Huynh and colleagues [27] conducted a study in a game setting (Art of Defense, AoD) placed in a co-located collaborative AR environment that could stimulate social interaction, including verbal and non-verbal communication (e.g., hand gestures and body language), even if the participants are strangers when they start the game. ARVita, introduced by Dong and colleagues [14], is tabletop AR software that lets users watch and engage with dynamic visual simulations of engineering processes through HMDs. Their study focused more on the implementation of the technology than on users' interactions with each other or the technology. In another study, Wells et al. [2020] examined how co-located group activities using a mobile AR interface were affected by the varying complexity of AR models. While they found that AR can support collaboration in a co-located group setting, a lack of collaboration mechanisms can negatively affect collaboration. In that case, groups focus more on trying to find ways to overcome the issue.
Bhattacharyya et al. [2019] presented a model for designing S-AR experiences along with the issues of developing such models and the major categories of interaction in an S-AR environment. Another study by Xu et al. [2008] focused more on social interaction by creating a prototype called BragFish and demonstrating participant strategies for social play using visual, aural, and physical cues. Franz et al. [2019] found enhanced engagement among participants using S-AR views compared to those using private AR views in a museum context. Furthermore, Bhattacharya et al. [2019] analyzed group dynamics during Pokémon Go raiding, highlighting supportive and challenging aspects of ad-hoc group formation. On the other hand, to understand non-verbal communication in shared and social spaces, Maloney et al. [2020] conducted a study on social VR and reported on the factors that make social VR unique and socially desirable.
Research on communication and collaboration behaviors in co-located CS-AR environments is limited, and DHH participants are notably absent from the studies that do exist. We aim to address this gap.

METHOD

Data Collection
Our data collection process was divided into two main sections: 1) a user study and 2) one-on-one interviews. The gameplay user study allowed the participants to experience the game in a co-located collaborative AR environment, and the post-user study interviews enabled them to share their thoughts on the experience.
To initiate our user study, we selected Urban Legends, a prototype of a co-located CS-AR game for which we obtained early access from Niantic. According to Niantic, this is a glimpse into the future of AR and 5G networks [57].
The game uses a combination of tangible AR, S-AR, and co-located C-AR, which lets the participants interact with the AR components of the game [8] and share a common view of them, along with the ability to manipulate them simultaneously [51], while remaining in the same physical space [36]. One game session can have a maximum of six participants and lasts at most 240 seconds. The dynamic AR components of the game require participants to physically move or take action quickly. Participants can see their health bar on their screen. The game's primary objective is to cast spells to battle monsters and save imperiled allies within the time limit [58]. The user interface of the game is shown in Figure 1, and the game's roles and activities are further described in Table 1.
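The session rules reported for this game (at most six participants, a 240-second limit, two keys obtained from the jackalopes, freeing the yeti, and then defeating the firefly) can be summarized as a small state machine. The sketch below is only an illustrative reading of those rules, not Niantic's implementation: the class name `Session`, the phase names, and the event strings are hypothetical, and treating any event that arrives after 240 seconds as a loss is our assumption.

```python
from dataclasses import dataclass
from enum import Enum, auto

MAX_PLAYERS = 6      # maximum participants per session (from the text)
TIME_LIMIT_S = 240   # session length in seconds (from the text)
KEYS_NEEDED = 2      # keys required to free the yeti (from the text)


class Phase(Enum):
    """Hypothetical names for the stages of one game session."""
    COLLECT_KEYS = auto()
    FREE_YETI = auto()
    FINAL_BOSS = auto()
    WON = auto()
    LOST = auto()


@dataclass
class Session:
    players: int
    keys: int = 0
    phase: Phase = Phase.COLLECT_KEYS

    def __post_init__(self) -> None:
        if not 1 <= self.players <= MAX_PLAYERS:
            raise ValueError(f"a session takes 1 to {MAX_PLAYERS} players")

    def step(self, event: str, elapsed_s: float) -> Phase:
        """Advance the session on a (hypothetical) event at a given time."""
        # Finished sessions do not change state.
        if self.phase in (Phase.WON, Phase.LOST):
            return self.phase
        # Assumption: anything happening after the limit means the team lost.
        if elapsed_s > TIME_LIMIT_S:
            self.phase = Phase.LOST
            return self.phase
        if self.phase is Phase.COLLECT_KEYS and event == "key_obtained":
            self.keys += 1
            if self.keys >= KEYS_NEEDED:
                self.phase = Phase.FREE_YETI
        elif self.phase is Phase.FREE_YETI and event == "yeti_freed":
            self.phase = Phase.FINAL_BOSS
        elif self.phase is Phase.FINAL_BOSS and event == "boss_defeated":
            self.phase = Phase.WON
        return self.phase
```

Because the whole team wins or loses together, a single shared `Session` object is enough here; per-player health and spell cooldowns are deliberately left out of the sketch.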
Table 1: Game roles of Urban Legends and their activities.

Game Role | Spells | Attacking Spell | Assisting Spell
Offense | Ice Bomb and Ice Shard | Ice Shard; causes a great deal of damage to the enemies | Ice Bomb; allows participants to defend themselves and other participants within close proximity in physical space
Support | Heal Aura and Snowball | Snowball; causes less damage to the enemies than its counterpart in the other role | Heal Aura; ignites a magical aura around the player activating it, which can heal the damage done by enemies to that player as well as to anyone who enters the aura or is in close vicinity in the real environment

In this Player versus Environment (PvE) game, all participants collaborate as a unified team, striving to overcome and defeat all the monsters on the opposing team. The goal is to secure victory by defeating all the monsters within the given 240-second time constraint. The game starts with the first player initiating a session, and others join by scanning the QR code on the first player's device. Each participant then selects one of the two available roles. The primary responsibility of the first player is to position the shackled yeti in a specific location and scan the surroundings, referred to as "localization" (see Figure 2). After the first player enters the game, the other participants have to scan their surroundings to find the yeti, and they can enter the game as soon as the yeti shows up on their screen; this is localization for the other participants. When all the participants have entered the game, the game session begins instantly (see Figure 3). Over the course of the game, players must overcome relatively less powerful monsters (the jackalopes) to obtain two keys, which are essential for freeing the restrained yeti (see Figures 4a and 4b). Once the yeti is liberated, the team must confront the "final boss" (the firefly) to achieve victory (see Figure 4c). Failure to complete all these tasks within the 240-second time frame results in a loss for the entire team.

Participants.
We deployed a screening questionnaire implemented in Google Forms (see Supplementary Materials, section 1), distributed through email and flyers among college students. We asked participants for their names, pronouns, whether they were DHH, and their familiarity
with AR products. In the second step, based on the collected data, we contacted participants who identified as DHH and asked them to let us know when they could participate in the user study. Additionally, we informed them that the user study would take place in a group setting, where they would have to play the game with other DHH participants with varying levels of hearing abilities, and that other participants' preferred methods of communication might not match theirs. We did not include non-DHH people in our research because we were focused on understanding the experience and needs of DHH players. Furthermore, research has highlighted the potential harms that can emerge, such as perpetuating ableism, when adding additional participants as a control group [42]. Once we determine the access needs of DHH players, we can design future studies to further investigate group dynamics when non-DHH players are present. Based on the responses, we contacted eligible and consenting participants who would also be available to join the one-on-one interview. Altogether, 17 self-identified DHH participants, aged between 19 and 32, joined the user study and interviews. Participants mentioned various modes (spoken English, written English, and American Sign Language) as their preferred methods of communication. Five participants were deaf, and the rest were hard of hearing; 14 of them mentioned spoken and written English as their preferred choice for communication, and the rest preferred written English and ASL. Since the majority of the participants were comfortable with verbal communication, the study allowed us to simulate real-world situations where people with diverse hearing abilities frequently collaborate. Moreover, it is important to note that spontaneous interactions with unfamiliar participants, who might have different communication preferences, could arise in real-world scenarios while playing the game. The participants were put into seven groups; three groups had three participants, and the other four had two participants. Each group of participants was a combination of DHH people with heterogeneous levels of hearing. Figure 5 shows two groups of participants playing the game in a public space and a private space, respectively. Please refer to Appendix A for more information about the participants' demographics and group distribution.
Procedure.
We formed the groups solely based on the availability of the participants, who had no opportunity to learn about the other participants in their group before the user study. Most of the participants (13/17) met each other for the first time, except two participants [P04G02 and P05G02] in Group 02 and two participants [P13G06 and P14G06] in Group 06, who were familiar with each other. We intentionally refrained from acquainting participants within a group before the game to investigate whether the environment could stimulate social interaction among DHH participants who were previously unfamiliar, similar to its impact on non-DHH participants [27]. The participants were offered 30 USD Amazon e-gift cards as compensation for their time.
Each group of recruited participants played five rounds of the game, the first two of which took place in a smaller private area and the other three in a wider public one. This distinction aimed to explore whether people moving around the participants affected their overall gaming experience, as well as the participants' space preferences. We gave them a general overview of the game and the design of the user study; however, they had to explore the abilities and activities of each role in the game themselves. They had a discussion period before each round to pick roles and discuss strategy.
The communication method they would employ before and during the game was also left to them. We observed that the groups where all the participants could communicate using spoken English (Group02, Group05, Group06, and Group07) used verbal means to communicate among themselves. However, groups with participants who did not prefer spoken English (Group01, Group03, and Group04) used improvised hand gestures, body language, speech-to-text applications, and text applications on their smartphones. Their hand gestures and body language were not sign language, as not all participants in these groups were proficient in it.
We recorded their gameplay with an external camera, along with the screens and audio of their devices, for further reference, and took notes on our observational data. After the gameplay, they were told to pick a suitable time and mode (verbal or text-based) for a one-on-one interview, and based on their preferred mode, we conducted the interviews through Zoom [1]. Since the interview was semi-structured, we could further improvise the questions based on each interviewee and the user study. We recorded the interviews to detect any data discrepancies.
Most of our interview questions (see Supplementary Materials, section 2) were about the participants' gameplay experiences, issues they faced while collaborating, communicating, and coordinating in the game, and design implications for addressing the issues from their perspective.

Data Analysis
The collected interview data was transcribed and cleaned after cross-referencing with video recordings and participants' responses. Any discrepancies, such as in localization time or game round duration, were noted. We brought together in NVivo the observational data from the user study, the transcripts of participants' interviews generated from the video recordings of the interviews, and the additional notes we took during interviews. Using NVivo, we coded the data sets through tagging, facilitating the identification and characterization of recurring themes and patterns. Employing the reflexive thematic analysis (RTA) approach [9], we organized codes to develop themes. As our analysis progressed, our familiarity with the data grew, providing increased interpretive flexibility to discern new patterns.
In the initial coding process, we employed semantic and latent coding [31]. The participants' explicit interview data served as the source of the semantic codes. We started labeling the data as a summary of the participants' statements, such as "collaboration during discussion time", "felt not communicating enough was an issue", and "felt frustrated when other participants took longer to find yeti". Additionally, we developed the latent codes by contrasting our observations and notes from the user study and interview with the participant-provided explicit interview data. Initially, we started with 240 codes, a mix of latent and semantic codes. After going through several iterations of the initial codes, we refined them into 170 codes. These codes were then organized into three main themes: 1) the dynamics of interplayer communication and collaboration strategies; 2) various dimensions of coordination; and 3) future preferences and recommendations from the DHH user's viewpoint. The second theme has three sub-themes, one for each dimension of coordination: a) with AR components, b) with surroundings, and c) with other participants. Our thematic map is shown in Figure 6.
Additionally, it is important to acknowledge our position as researchers; we recognize that our team is not part of the DHH community. While our affiliation with a local Deaf institution facilitated the involvement of DHH students in our research, we acknowledge the potential bias in our interpretation of results due to our outsider position as hearing individuals. By grounding our research in the feedback and experiences of DHH participants, we strive to provide valuable insights while inviting critical evaluation and feedback from the DHH community to enrich future endeavors.

Participants had to communicate in a real environment outside of the game. As they took action (e.g., attacking the enemies, dodging an enemy attack, or stepping into the healing aura to boost health), their actions were reflected in the AR game environment. We observed them using verbal cues, gestures, and body language as their methods of communication before and during the game. Participants who could communicate verbally did so using verbal cues before the game; otherwise, they used gestures and body language. Three out of the 17 participants [P01G01, P07G03, and P09G04] could not communicate verbally; they were competent in ASL but were unaware whether other participants in the group could use ASL. Interestingly, we observed that all the participants who could not communicate verbally came prepared with a speech-to-text app on their smartphones. Whenever they had to participate in complex communication, such as selecting a role, discussing game strategy, and expressing their comprehension of the game roles during the discussion period before each round, they used the speech-to-text app to comprehend what other participants were saying verbally. When it was their turn to communicate, they typed their thoughts on their smartphones and showed them to other participants. P01G01, who was deaf, said later in a text-based interview, "We did the gestures by pointing them out to each other, and sometimes we used texting to communicate." [P01G01]
The key finding from all three participants who could not communicate verbally was that the game was fast-paced and they did not have time to type, so they opted for gestures when they wanted to communicate with others during the game. Participants who could communicate verbally (14/17) used verbal cues mostly for communicating before each round. P04G02 explained, "During the discussion time, we communicated the entire time verbally. The other people in my team showed no difficulty in just communicating verbally, so that was the method I defaulted to." [P04G02]
Another notable discovery was that, when interacting with participants who communicated non-verbally, various methods like speech-to-text, typing on the phone, gestures, and body language were employed. P02G01, who played with a non-verbal participant, mentioned, "I tried to communicate verbally and with hand gestures. I noticed that one of the participants was definitely deaf, and I don't really know a lot of ASL. The only problem, I would say, was figuring out how to communicate so that everyone could understand. So I had to rely on either hand gestures or texting on the phone." [P02G01]
During the game, they used verbal cues to communicate with other participants who were comfortable with spoken communication; otherwise, they used hand gestures. In our coding, we labeled this as "gestures during the game." However, more than half of the participants (10/17) mentioned not communicating much during the game. P03G01, who played with one participant who could communicate verbally and one who could not, said, "To be frank, we didn't really communicate with each other much during the gameplay. I feel like we would have communicated more if we knew each other, but we didn't." [P03G01]
Another participant, P15G07, who played with two other participants who could communicate verbally, said, "I don't really think there was that much communication necessary during actual gameplay." [P15G07] Even though there was little perceived communication during the game, only one participant [P08G03] mentioned it as an issue, saying, "I don't think we had the best communication during the game. So that was an issue." [P08G03]
In contrast to this sentiment, we observed and recorded all the participants using body language or hand gestures during play to communicate with others in the game.As participants were focused on their own activities in the game, communication may have been less prioritized in their recollections of the play.After all rounds, we observed two participants [P02G01 and P11G05] using ASL to convey their names, even though they could communicate verbally.

Collaboration Strategies.
We divide collaboration during the user study into two phases: (i) before each round of the game and (ii) during the game. More than two-thirds of participants (12/17) reported equal collaboration during role selection discussions between rounds, as elaborated by P14G06, "I felt like we're pretty even. I don't think either of us did more than the other." [P14G06] Two of the participants, from Group 01 [P02G01] and Group 03 [P08G03], mentioned that they were more outspoken during the discussion time and initially led the discussion about roles and strategies, but they would not describe it as aggressively leading either, as a comment from P02G01 clarifies, "It was mostly me commenting on strategy and noticing the things with the additional members of the team. And then me saying, 'Hey, this is the best idea,' and no one really tried to veto that." [P02G01]
Like the communication aspect, our coding highlighted another prevalent theme: participants collaborated less during the game, even though more than half (9/17) identified it as the best aspect of the game. Comments from P17G07 further solidified our observation, "I'd say it [collaboration] happened in one of all five rounds, and again, we mostly communicated before each round [...], but asking for help, I think it only happened once in every play that we did." [P17G07]
The participants' concept of collaboration during the game revolved around assisting one another throughout the gameplay. Only two participants [P10G04, P13G06] stated there was plenty of collaboration during the game; on the other hand, 8 out of 17 participants mentioned they did not often collaborate, while one participant [P15G07] said he did not collaborate at all during the game. P16G07 said, "Because nobody was in danger of dying. If someone were in danger of dying, I think we would communicate more." [P16G07]
We labeled comments like this as "limited collaboration due to the game's design," indicating that participants perceived the game's design as a factor contributing to restricted collaboration during gameplay.P12G05 also brought up an intriguing point, "We knew what to do, and I didn't need help that much in the later rounds." [P12G05] This suggests participants adapted and became more familiar with the game through repeated rounds.

Coordination With AR Components, Other Participants, And The Surroundings

With AR Components.
While 15 out of 17 participants faced difficulties in scanning and finding the yeti, two participants consistently located the yeti first. Common issues included prolonged scanning times and localization only working from specific places. Six participants mentioned the slow localization process, suggesting the need for settings retention between rounds. Another five participants noted the second issue, emphasizing finding a "sweet spot" for scanning and the necessity for all participants to gather in a particular place.
Despite these challenges, communication, primarily through gestures and body language, played a crucial role in resolving localization issues. Speech-to-text apps were not used due to potential disruption of the gaming session. Over multiple rounds, more than half of the participants found localization easier, with reduced setup time.
After starting the game, physical movement in the game space was required to coordinate with various components, such as attacking enemies, dodging attacks, grabbing the key, and freeing the yeti. Some participants (4/17) reported issues like enemies spawning behind or too close to them. P15G07 further explained movement in the game, "I found myself moving to a more strategic location that was not necessarily covered by the other two participants." [P15G07]
Only one participant (P04G02) experienced a negative impact on gameplay, mentioning tunnel vision and becoming accustomed to looking in a straight direction.
Overall, despite initial challenges, participants adapted to the game dynamics, demonstrating improved performance and communication over multiple rounds.

With Surroundings.
Each data collection event consisted of five rounds of play in two different settings. The first two rounds were in a lab setting, which was a smaller private area with a few pieces of furniture, and the last three rounds were in a bigger open hallway, which was a public area. Seven out of 17 participants preferred the private setting, but only one mentioned it was wide enough. Two participants stated that they had some safety concerns in the private area, like "bumping into the furniture or another person" or "very easily bumping into something and dropping something", while another participant [P05G02] mentioned that game components are affected by the smaller space as "you get crumbled and you don't get to see the enemy." On the other hand, the rest of the participants (10/17) preferred the hallway due to its spaciousness, absence of furniture, and open layout, which made movement easier. This public area, unlike the private one, attracted non-participants, leading to occasional interruptions from passersby during gameplay. While most participants (14/17) were comfortable with being observed, over half (9/17) found non-participants distracting. Minor issues included the need for participants to avoid obstructing non-participants' paths, interference with gameplay due to non-participants, disruption to AR components, and non-participants attempting to observe out of curiosity.

With Other Participants.
Most of the participants (13/17) met each other for the first time; the exceptions were two participants [P04G02, P05G02] in Group 02 and two participants [P13G06, P14G06] in Group 06, who were already familiar with each other. Participants had to communicate with each other to decide the roles and strategies, but as they were strangers, they felt hesitant in the beginning. However, as they played a few rounds, they could communicate more easily, which led them to choose appropriate roles and strategies more effectively.
Participants randomly chose the first player in six out of seven groups. In Group 06, technical issues with a participant's device led to a consistent choice of the first player. When it came to choosing the role, 16 out of 17 participants identified one role (offense or support) as preferable to the other, and all the participants attempted both roles at least once. However, even though they had the freedom to choose whichever role they wanted, participants often chose the role that was needed to win the game rather than the role they preferred. Additionally, they did not face any kind of conflict while doing so. For instance, P04G02 said, "On our first attempt, we did two support and one offense, mostly because everyone was picking roles at random. But after our first attempt, we realized that the healing area and the cooldown, at that point became two offense and one support. Because one support was more than enough to take care of two, three, four, five, six people (offense). And without them, we would be in danger of dying." [P04G02] However, he further indicated that he preferred offense. Additionally, all participants experienced positive changes as rounds progressed: a better adaptation of the game objective (15/17), smoother communication (14/17), faster gameplay (13/17), better strategies (10/17), shortened discussion period (9/17), smoother localization (9/17), and better collaboration (3/17). Participants who indicated that their discussion time was shorter in the last few rounds compared to the first few rounds also mentioned that they knew what the roles did and "pretty much went through the repetition that needed to."

Future Preferences And Recommendation From DHH Users' Viewpoint
Most participants (13/17) mentioned that they would like to play an S-AR game similar to the probe we used for this user study in the future. However, about half of them (7/13) indicated a desire for more content and resolution of the technical issues observed in the prototype. We found that participants had varying spatial preferences for future experiences: wide spaces (4/17), private spaces (3/17), public spaces (4/17), or semi-private spaces (2/17). In terms of whom they would like to play with, the majority of the participants (11/17) preferred to play with their friends, as it would be easy to communicate and collaborate with them. The rest (6/17) were willing to play the game with strangers. Still, about half of the participants who wanted to play with their friends (6/11) would not like to play this kind of game very often, as it could become repetitive and quickly lose the player's interest. Participants suggested game design ideas that could further improve communication during the game, for example, an on-screen indicator showing the health of other participants (5/17), an indicator of the location of enemies outside the participants' view (2/17), and live captioning for DHH participants (1/17). They all agreed this could be useful in a setting where the participants could not perceive auditory cues. Surprisingly, only one player suggested an embedded chatbox, which is more frequently used in other multiplayer games as a means of in-game communication. One possible reason why most participants (16/17) did not suggest an embedded chatbox is that typing may take longer than speaking or even using gestures or body language. Moreover, the average duration of each round was 120 to 150 seconds, excluding the localization process. This made the game fast-paced, and typing during the game would consume valuable game time.

Multimodal Communication Between Participants
We observed that a shared co-located AR environment where participants must constantly move to take action can foster social interaction through verbal (spoken) and non-verbal (hand gestures and body language) communication. This observation remains consistent even when participants begin the game as strangers, aligning with prior studies on board games in co-located S-AR environments [27]. Participants engaged in multimodal communication during the game, utilizing both verbal and non-verbal interactions in the physical world. Notably, all the interactions took place outside the game due to the absence of an embedded in-app communication system (e.g., chatbox, voice chat, in-game mails, etc.). Furthermore, the game could facilitate spontaneous interactions with strangers, where participants had varying communication preferences. Among our participants, concerns that other participants could not understand their primary means of communication drove multimodal communication. Improvised non-verbal techniques (hand gestures and body language) and mutual attunement developed gradually as play commenced to serve communication needs, a process similar to the one described by Wang et al. [2018].
The majority of participants were comfortable with and favored verbal cues, but some had a preference for non-verbal communication. Participants who could leverage verbal cues did so before and during the game. However, when they noticed that some participants could not engage in verbal communication, they naturally transitioned to non-verbal methods (e.g., hand gestures, body language, and text-based communication). These non-verbal cues played a vital role in augmenting and supplementing the communication process for our participants who relied on visuals such as sign language and lip reading. Additionally, most of the participants started as strangers and were initially hesitant to communicate with each other. As they advanced through the rounds, their comfort level increased, leading to smoother and more cohesive communication. This trend aligns with findings from studies in co-located AR settings, including board games [27] and raiding in Pokémon Go [5].
Non-verbal communication played a positive role in a co-located CS-AR environment, similar to social VR environments [43]. Participants employed smartphone speech-to-text apps for pre-game discussions involving intricate topics like role selection and strategy formulation, with deaf participants particularly driving this method. Earlier research validates this behavior [34], where sign language users devised communication behaviors and modified their primary mode of interaction. However, in the final rounds, before and during the game, hand gestures and body language became the primary non-verbal communication method. As participants grew more familiar with roles and game components, complex pre-game conversations diminished, refocusing communication on simpler interactions. The consistency of hand gestures and body language throughout gameplay was attributed to the game's straightforward nature, eliminating the need for intricate conversation in the later rounds. Additionally, participants refrained from converting speech into text and responding with text due to the session's brevity and the game's fast pace.

The Relationship Between Communication And Collaboration
Similar to existing work related to C-AR [40,59,63], Urban Legends successfully fostered communication among strangers, which in turn influenced the extent of collaboration among participants. In both phases, before and during the game, communication played a vital role in determining the amount of collaboration. Most communication occurred in the first few rounds before the game started. Before each round, participants collaborated to discuss strategies and roles for the round through verbal and non-verbal communication. Participants demonstrated patience and mutual assistance in their collaboration. For example, we observed them repeating game strategies to ensure accurate interpretation by the speech-to-text app. Here, accessibility acted as a collaborative practice, and participants who used verbal cues shared the responsibility of creating an accessible CS-AR environment, as argued by existing work [61]. Each group had to play five rounds, and in the first few rounds, they tried to understand the game mechanism and the functionalities of each role during the gameplay. However, compared to the discussion before the game, less communication (verbal and non-verbal) was seen during the gameplay, resulting in less collaboration in this phase.
In-group communication among participants became smoother as they progressed through more rounds. For instance, they knew which hand gesture indicated someone required healing and which one signaled "enemy behind you". The finding again drew attention to the fact that participants gradually learned to collaborate in a mixed-ability group [61] in a CS-AR setting. Even though collaboration depended on communication, smoother communication (verbal and non-verbal) did not play a vital role in increasing the amount of collaboration as the participants went through more rounds. Aside from when the participants ran into technical issues (e.g., game lag, sudden shutdown of the game, failure to enter the game session) due to the game being a prototype, the overall level of communication dropped even further in the last few rounds as the participants became more independent and accustomed to the game.

Evolution of Coordination Behavior
We observed coordination in Urban Legends from three major perspectives: i) with AR components, ii) with surroundings, and iii) with other participants.
As rounds progressed, participants, including DHH individuals, showcased improved coordination in the CS-AR game. They discovered specific spots for optimal scanning and initiating the session. Participants collaborated, using both verbal and non-verbal cues, to identify these spots collectively. Their willingness to work together aligns with findings in previous literature [47,59]. Despite challenges like AR component issues (e.g., yeti moving or enemies spawning unexpectedly), participants adapted by adjusting their physical movements and attack angles.
The user study occurred in two different spaces to assess participant preferences. The majority favored the second space, an open public hallway, citing safety concerns in the first space, a private lab setting with a few pieces of furniture. Participants' frequent physical movements during in-game actions influenced their choice of a spacious area to avoid collisions and focus on the game. This aligns with findings by Shin et al. [2019], suggesting larger spaces enhance presence and narrative engagement, while furnished spaces increase perceived workload in AR games, regardless of participants' hearing ability. The presence of non-participants in the game space was noted but did not distract participants, who remained cautious not to obstruct paths.
Coordination among the participants started when they participated in the discussion before the first round of the game. Being mostly strangers with a surface-level introduction to the game, they randomly chose the roles and who should be the first player. However, after the first round, participants were more synchronized in selecting roles and strategies, supporting existing research [27]. Towards the end, participants developed preferred roles but remained flexible to assist others. Here, we observed solidarity and shared objectives. Rounds and interactions grew shorter throughout each session, with reduced collaboration and communication in later rounds. This further indicated that the participants were getting accustomed to the game and could play independently without other participants' assistance.

Challenges And Their Effects Throughout The Game
One of the significant challenges the participants ran into was finding a common way to communicate. We aimed to simulate real-world scenarios in the game, where often unfamiliar people with diverse hearing abilities and communication preferences would play together in groups. As a result, most of the participants met for the first time when they came to participate in the user study, and they did not know whether other participants favored their preferred way of communication. However, they overcame this challenge through multimodal communication, described in section 5.1.
Participants also ran into several technical issues, similar to those in other AR games [46], that hampered their communication, collaboration, and coordination during the game. For instance, almost all of the participants ran into problems regarding localization, and they tried to communicate with each other and collaborate to solve this. Initially, participants encountered challenges in finding effective non-verbal communication methods with others, leading to awkwardness during their first encounters [61]. However, they found a way to quickly solve this issue in later rounds as they got accustomed to the game process, and their coordination improved as the rounds progressed. Furthermore, a few participants faced game lag, and the game shut down suddenly, throwing them out of the game session. They could not solve the lag, and the other participants remaining in the game played until the session was completed. These sudden technical issues cut down their communication and negatively affected the participants' coordination among themselves and with the AR components.

Design Implications For DHH Users
The findings reveal opportunities and priorities for enhancing accessibility and inclusive gameplay experiences. This section translates the study takeaways into design-oriented guidance by proposing and discussing specific interface enhancements, adaptations, and accommodations tailored for DHH users.

Support Multimodal Communication.
A key finding was DHH participants' reliance on blended multimodal communication, including verbal cues, non-verbal gestures, and text chat. Co-located collaborative AR systems should facilitate flexible communication across multiple modalities, such as verbal, visual, and haptic [52], to accommodate diverse needs. For verbal communicators, auto-generated speech-to-text transcription features [22,37] can ensure that text captions are available for deaf participants. Gesture and icon menus complement physical gestures, while spatialized ambient visual cues convey non-verbal signals such as attention directions or warnings. Haptic feedback also helps translate key audio events through the sense of touch. However, contrary to many other multiplayer games [18,28,62], participants did not advocate for a real-time text-based in-game messaging system, which parallels the results of the survey conducted by Elliot et al. [2016]. Additionally, it is important to acknowledge that integrating sign language interpretation [41,60] technologies into the game might be beneficial. However, we have limited insights into how effective it could be in facilitating communication among a diverse hearing group. Moreover, opting for an in-person interpreter can serve as a viable alternative in CS-AR scenarios, allowing the interpreter to participate in the game as a non-player. In this way, the interpreter does not take up a player's role but can still assist in interpreting the game context through sign language for the players. Nevertheless, given the impromptu nature of this game, which can occur without prior planning, arranging in-person interpreters might pose challenges due to their availability and the absence of pre-arrangements. Moreover, interfaces should support these communication modes early on with structured icebreakers and accessible trials, allowing teams to evaluate the most effective combinations of verbal, gesture, text, and other strategies.

Prioritize Customizable Visual Information.
Participants suggested employing an on-screen cue to indicate the enemies' position when not visible on their screens. This finding supports prior studies [39,55] showing the necessity of visual components for DHH participants to independently grasp the situational context, share non-verbal information, and improve their awareness and sense of social presence [21]. Designers should provide visual-centric communication options while empowering DHH users to customize their presentations based on personal needs and preferences [48]. Examples include adjustable captions [29], movable floating icons [35], player-placed map beacons, and camera feeds of fellow participants' faces/hands. Giving control over visual cue positions, sizes, colors, and formats accommodates the group's heterogeneity. Optional muting of distracting visuals also helps users manage input. These measures help compensate for limited audibility.

Simplify Coordination and Raise Awareness.
Findings showed that DHH participants required effort to coordinate virtually with game components, physically with environments, and socially with other participants. Streamlining orientation, navigation, enemy visibility, and team member tracking can relieve these burdens. Persistent off-screen indicators [45], mini-maps, spatial audio representations (as visual cues [24,30]), and visual avatars for virtual objects lower barriers. Designated play zones, distinct spatial audio cues, and physical barriers/guides ease physical coordination in hybrid spaces. Lastly, broad situational awareness, mutual gaze support, avatar/identity representations, and familiarity from repeated teaming help participants master social coordination. By leveraging these suggested strategies, co-located C-AR and spatial computing at large can become more welcoming and empowering for the diverse communication needs of DHH player communities.

LIMITATIONS AND FUTURE WORK
In our study, we selected participants and formed groups based on their availability, and each group had participants with varying levels of hearing abilities and different preferred modes of communication. The findings might differ if we had grouped our participants based on their hearing levels or favored method of interaction. For example, forming groups with deaf participants exclusively might yield findings different from ours. Furthermore, signers often gravitate toward one another in shared spaces, and including signer-signer pairs could have revealed additional communication dynamics. However, our priority was observing how signers adapted their communication across modes, so the groups did not include pairings of two signers, which would have increased the realism of modeling natural social dynamics within the DHH community. Additionally, most of the DHH participants' primary mode of communication was verbal, which limited our exploration of established visual communication modes, such as sign language (although our participants used gestures). Furthermore, all the participants were college students accustomed to AR technology, even if the particular AR probe used in the user study was new to them. Participants' comprehension of AR environments, cooperation strategies, and willingness to engage in social interactions could therefore have introduced biases in assessing challenges.
It is crucial to emphasize that, given the game's nature, participants could successfully play and win with relatively minimal communication. Additionally, they refrained from recommending the inclusion of written communication methods, such as a chatbox in the game, citing its fast-paced nature and brief sessions. Nevertheless, our findings may not apply to scenarios demanding more intricate communication, particularly in situations with prolonged game sessions that exceed the capacities of non-verbal modes.
In the future, we want to recruit participants from more diverse backgrounds (e.g., age, occupation, knowledge of AR, pairing signers with signers) to have a balanced number of participants from different demographic groups. In this study, we identified challenges and design implications, particularly for the gaming context; however, future work should focus on a different co-located collaborative AR environment and the challenges DHH people face. We anticipate finding some environment-specific and general challenges that participants will face in both environments. Moreover, in the future, involving non-DHH participants in user studies could offer valuable insights into a diverse set of research questions through realistic scenarios. Additionally, as the dynamics of collaborative AR games evolve, there is a potential for DHH participants to employ more elaborate multimodal communication strategies. This becomes particularly relevant in games with accelerated pacing, intricate objectives, and longer game sessions, where the need for richer communication exchanges may arise. Exploring these scenarios will offer deeper insights into the applicability of our findings and design implications, especially in situations where non-verbal cues alone may prove insufficient for effective coordination.
Furthermore, we recognize the potential for our findings to spark new avenues of exploration within the field of human-computer interaction (HCI). Specifically, this research could prompt game designers and developers to explore innovative approaches to improvised communication as integral elements of game design and playtesting methods. By emphasizing the playful potential of such communication early in the design process, we hope to inspire new HCI methodologies that improve the overall user experience in AR applications and gaming scenarios.

CONCLUSION
The present article explores how DHH people communicate, collaborate, and coordinate in a multiplayer game environment in a co-located CS-AR setting. We recruited 17 DHH participants who gained firsthand experience through a user study involving Urban Legends, a co-located CS-AR game. Participants later shared their experiences in a one-on-one semi-structured interview. Our findings from the gathered data illustrated the communication and collaboration dynamics, the evolution of coordination, the challenges participants faced and their effect, and finally, some design implications to make this environment more inclusive for DHH users. Our findings can be fruitful in future research related to accessibility and diversity within the field of co-located CS-AR environments beyond gaming spheres. However, our study had some limitations due to demographic biases and unexpected technical difficulties because the probe was a prototype. In future work, we will focus on the challenges DHH participants face in a co-located collaborative AR environment, leveraging a different AR environment as a probe. We anticipate identifying more generalized and environment-specific challenges and exploring ways to overcome these challenges from the users' point of view to make these environments more inclusive.
(a) How DHH participants leverage multimodal communication by blending verbal and non-verbal cues based on each other's needs and capabilities.
(b) The interrelationship between communication and collaboration, with increased communication fostering enhanced collaboration among DHH participants.
(c) Coordination strategies DHH participants employ to overcome challenges and enhance gameplay over time.
(2) Offers design implications, such as auto-generated speech-to-text transcription and customizable visual information, grounded in DHH participants' experiences and suggestions, for improving communication accessibility in co-located CS-AR games.

Figure 1: (a) Two roles of the game, (b) a player playing as support, and (c) a player playing as offense.

Figure 2: First player (a) trying to set down the chained yeti to start the game, (b) finalizing the position of the yeti, and (c) scanning the surroundings to start the game.

Figure 3: Another player (a) waiting for the first player to finalize the position of the yeti, (b) scanning around the place where the first player positioned the yeti, and (c) still trying to find the yeti when the other two participants are in the session.

Figure 4: During the game (a) jackalopes are attacking the players, (b) one key to partially free the yeti has appeared after defeating a certain number of jackalopes, and (c) the firefly has appeared.

Figure 5: Two groups playing the game, (a) in a public setting, and (b) in a private setting.
Peng et al. [2018]ing substantial interest (90%) among deaf participants. Jain et al. [2018] extended this idea, allowing DHH users to customize real-time captions' appearance and placement in 3D space within AR. Their study emphasized enhanced glanceability, visual contact, and access to visual data. Similarly, Peng et al. [2018] contributed by developing a system that optimally arranges, displays, and visualizes real-time speech in AR, even for speakers outside the field of view. These endeavors collectively illuminate the potential of AR to cater to the needs of DHH people.
For instance, Mirzaei et al. [2012] presented a system combining AR, ASR, and TTS, transforming spoken words into visible text on the AR display in real time. They focused on system accuracy across
Previous research has identified common challenges faced by DHH individuals in video conference settings, offering guidelines to mitigate these issues. For instance, Kushalnagar et al. [2020], Jazz Ang et al. [2022], and Kim et al. [2023] suggested the use of live captions and transcripts, visual and haptic feedback, and expressive icons, respectively. Additionally, Keating et al. [2008] demonstrated adaptive communication behaviors within the Deaf community, adjusting sign language based on webcam proximity and visual attributes. Their insights highlight the interplay between technology and visual language use, collectively enhancing the understanding of communication dynamics for DHH individuals in various shared contexts.