Forging Productive Human-Robot Partnerships Through Task Training

Productive human-robot partnerships are vital to the successful integration of assistive robots into everyday life. Although prior research has explored techniques to facilitate collaboration during human-robot interaction, the work described here aims to forge productive partnerships prior to human-robot interaction, drawing on the role that team-building activities play in establishing effective human teams. Through a 2 (group membership: ingroup and outgroup) × 3 (robot error: main task errors, side task errors, and no errors) online study (N = 62), we demonstrate that (1) a non-social pre-task exercise can help form ingroup relationships; (2) an ingroup robot is perceived as a better, more committed teammate than an outgroup robot (despite the two behaving identically); and (3) participants are more tolerant of negative outcomes when working with an ingroup robot. We discuss how pre-task exercises may serve as an active task failure mitigation strategy.


INTRODUCTION
As assistive robots transition from autonomous tools into collaborative partners [15], it is imperative for robots and users to maintain productive partnerships. A productive human-robot partnership depends on the closely coupled concepts of taskwork (task performance) and teamwork (shared trust and cohesion for the purpose of achieving mutual goals) [19], with collaboration success crucially dependent on trust [25, 39]. For example, teamwork can aid in maintaining effective collaboration in cases of failed taskwork. Shared understanding within a team encourages fluency, trust, and effective communication, resulting in better team performance [40, 65]. Robot errors are a frequent complication that harms trust and thereby undermines productive collaboration. Despite the expanding advances in robotic technology, robot errors are inevitable; it is estimated that robot errors with negative impacts on task success occur every few hours in the field [7]. Robot errors may present a safety concern, damage task performance (impacting taskwork), harm people's perceptions of robots, and erode a user's trust and willingness to work with the robot (impacting teamwork) [4, 6, 42, 53]. However, by manipulating the relationship between human and robot (e.g., preemptively setting functionality expectations of the robot [69] and assigning blame [32]) and using mitigation and repair techniques (e.g., fault justification [13] and explanation [14]), the negative effects of robot errors can be mitigated and trust can be maintained.
Working in close collaboration with robots, especially when they make errors, necessitates an understanding of how being part of a group impacts a user's behavior, perceptions, and interactions. Social identity theory in human-human interaction involves concepts concerning inter- and intragroup dynamics, such as group membership and intergroup relations. A group member's social identity is defined by the group's connecting factor and the difference between them (the ingroup) and other groups (outgroups) [29]. Members therefore aim to maintain a positive social identity by biasing comparisons in favor of the ingroup over the outgroup [67]. However, if such a comparison is unfavorable to the ingroup, members will look to switch groups or alter the ingroup [29, 67]. We define a group as two or more individuals [21] and, in this paper, we specifically explore a group size of two, as is done in much of the current work on human-robot collaboration and group membership [31, 57].
To date, research has explored how forming group memberships may help create more productive human-robot partnerships [57]; in particular, prior works reveal that people find it easier to collaborate with, and exhibit preferential bias for, an ingroup robot [20, 22, 28]. While this is encouraging, previous research has largely focused on studying group membership in social human-robot interactions (e.g., [56]) with anthropomorphic robots and has relied upon the minimal group paradigm (wherein a minimal feature, such as color, is used to create a group) [35] or social games [8] to form a human-robot group relationship. It has remained unclear, however, whether group membership can be formed through the setup and training processes commonly used for non-anthropomorphic robotic manipulators, such as calibration and kinesthetic demonstration, and whether the effects of group membership can be observed in the context of non-social interactions with said manipulators (e.g., during the collaborative performance of physical tasks). In addition, little work has explored the impacts (beneficial or detrimental) that group membership can have on a human-robot team in the face of robot errors during non-social interactions.
In this work, we seek to answer two questions: 1) Can people form group membership with a non-anthropomorphic robot through non-social interactions in an effort to improve the human-robot partnership? and 2) How does group membership impact the taskwork and teamwork of a human-robot team, especially when challenged by robot errors? We conducted a mixed-design online study wherein participants collaborated with ingroup and outgroup robotic manipulators (within-subjects factor: group membership) to make pizza (Fig. 2). To form an ingroup relationship with the robot, participants were instructed to calibrate the robot and demonstrate pick-and-place operations (the underlying actions in the pizza-making task) to it (Fig. 1). During the collaborative pizza-making process, the robot would make main task, side task, or no errors (between-subjects factor: robot error) (Fig. 3). This error manipulation allowed us to examine how group membership may influence taskwork (via main task errors) and teamwork (via side task errors), studying how a robot's erroneous actions outside the immediate task at hand, but not outside the team, affect people's interactions with and perceptions of the robot. Our results indicate that participants were able to successfully form group membership with the robot through non-social, task-oriented interactions (calibration and demonstration) and that participants favored, and were more tolerant of, the negative task outcome precipitated by the ingroup robot compared to the outgroup robot. Our findings suggest the potential of using a pre-task robot training exercise to form group memberships in order to positively shape users' perceptions of robots and mitigate undesirable task outcomes. Next, we review the relevant background and related prior works that motivated this work.

BACKGROUND AND RELATED WORK
In this section, we discuss the theory behind group membership and intergroup differentiation, as well as prior research on the formation of group membership and its benefits in human-human and human-robot interaction. In addition, we review the effects robot errors have on human-robot teams, how manipulating the human-robot relationship can change those effects, and how group membership is affected by these errors.

Group Membership in Human-Human Interaction
The concept of group membership (ingroup vs. outgroup) stems from human-human interaction and is defined by an "us versus them" mentality. Individuals are motivated to form ingroups to distinguish themselves as better than other groups [29]. An ingroup is formed and maintained through social categorization, identification, and comparison. A person's social categorization is typically based on common attributes or social factors. This leads to conformity to the group's norms and to social identification as a member finds that they emotionally belong to the categorization [29, 67]. While these groups are generally formed around common social values, groups can also be established under the minimal group paradigm, based on arbitrary details rather than a social characteristic [18]. Social comparison occurs by pinpointing important attributes in which the ingroup differs from outgroups. Three factors affect the assessment of intergroup differences: an individual's emotional investment in the group, the context for the direct comparison, and how reasonable the outgroup is as a point of comparison [67]. All of the processes involved in creating and sustaining ingroup membership serve to maintain a positive social identity [67]. As a result, people exhibit ingroup bias by prioritizing members of their own group over those of other groups, rating ingroup members more favorably and cooperating with them better [24, 44, 66].
However, because groups are formed based on collective social identity, when a group member threatens that identity (a "deviant" member), the rest of the group passes harsher judgment on them than on a member of an outgroup. This is known as the Black Sheep Effect (BSE) [38]. It is important to note that this effect appears only if the group's social identity is threatened. In fact, if the bad behavior does not affect social identity, people are more likely to forgive the ingroup than the outgroup, the opposite of the BSE [51, 52]. Both ingroup bias and the BSE arise from the group's inherent desire to maintain a positive social identity [67].

Group Membership in Human-Robot Interaction
Human-robot group memberships have been successfully created based on shared characteristics between people and robots, such as a robot's supposed nationality ([20, 28]; group size two), school affiliation ([47]; group size two), and major ([59]; group size two). Moreover, as with human-human interaction, groups composed of humans and robots can be shaped by "socially non-relevant features" (the minimal group paradigm) [35, 36] (group size three or more); for example, assigning people to teams by color ([45]; group size two) or arbitrary team assignment based on experience ([17]; group size two).
However, group membership has also been formed through social team-building activities commonly seen in human-human interaction. A typical ice-breaker activity of "two truths and a lie," which fostered the perception of anthropomorphism, was shown to successfully create ingroup membership (group size three) with a mobile robot [8]. Another method for group formation is goal setting and role clarification, which was shown to create human-human and human-autonomous agent teams (group size two), with this "formal team building" yielding better task performance than the other type of team-building activity studied. That other type, "informal team building," in which participants and their partners played an unrelated collaborative game, was also investigated, but the differences in performance between no team building and either team-building activity were not explored [68].
The biases seen in human-human interaction are reflected in social robotics during human-robot interaction under some circumstances. Users can apply the same "social categorizations" to robots, resulting in them favoring anthropomorphized ingroup robots more than outgroup ones [20]. In human-robot collaboration, group membership can affect the ease of interaction and extent of collaboration: participants cooperated more with, and found it easier to collaborate with, an ingroup robot than an outgroup one [28]. Group membership can also influence how members behave, with one study observing that a vulnerable robot could cause the human members of its group (group size four) to also exhibit vulnerability [63]. Ingroup bias can benefit human-robot collaboration because it has a large effect on users' opinions of a robot and their willingness to work with it again. For example, ingroup robots have been shown to be favored over outgroup humans (group size four). Additionally, in a competition scenario, users evaluated ingroup robots more positively than outgroup ones, even when users had to compete against the ingroup robot [22].
Group membership also benefits trust in the robot. An ingroup intelligent system was trusted more than an outgroup one, regardless of the level of reliability these systems provided [70]. These results suggest that group membership (group size three) could be used to moderate trust: participants should work with the ingroup system when they "undertrust" and with the outgroup system when they "overtrust" [70]. Ingroup membership could also compensate for task failure with regard to assessments of robot intelligence: participants rated an ingroup robot during task failure as similarly intelligent to an outgroup robot during task success [8]. This runs contrary to the BSE, which, as discussed in the next section, other social robotics studies did observe.

Robot Errors and Their Impacts on Group Membership
As mentioned previously, robot errors are inescapable, affect both efficient taskwork and teamwork, and negatively impact trust [27]. Prior work has shown avenues for mitigating the effects of errors on trust by altering aspects of the human-robot relationship and the user's perception of the robot. For example, setting a user's expectations of a robot (low vs. high) before a study impacted the effectiveness of trust recovery after an error, with the low-functionality expectation yielding greater trust recovery [69]. The assignment of blame after an error can also manipulate a user's trust in a robot [32]. In addition, common mitigation techniques for robot errors operate through social means, even with non-social robots and tasks, such as asking for help (e.g., [33]) and acknowledging the error (e.g., [49]).
Some research has explored the role group membership plays with regard to a "deviant" robot, demonstrating the negative impact ingroup bias can have. A "deviant" robot is defined as one that has low warmth and competence [59]. In a social robot scenario (group size two), a deviant ingroup robot was shown to be favored less than an outgroup one, exhibiting the BSE. Moreover, the robot's competency was found to be the driving factor in the robot's evaluation [59]. While this research did explore how competency affects the perception of ingroup and outgroup robots, there was no objective way for users to evaluate robot competency.

Knowledge Gap
Prior work has shown that humans and robots can form groups that benefit the humans' perceptions of the robots, trust, task performance, and collaboration quality. However, the downside to establishing an ingroup and exhibiting ingroup bias is that humans may judge the ingroup robot more harshly when the robot's competence is low. All of the aforementioned research was conducted in the context of social scenarios or used social human-human interaction techniques to form groups, largely with humanoid robots. Robot embodiment affects user perceptions in the face of robot errors [34] and anthropomorphism impacts trust [42]; thus, research is needed that examines group membership in non-social scenarios with non-anthropomorphic robots. Furthermore, little work has directly manipulated robot errors and examined the effect group membership has on the perception of the robot that makes the errors and on the perception of the errors themselves.
This paper fills that gap by introducing group membership and its formation into non-social scenarios with non-social robots. We show that groups can be created through non-social, end-user robot training and demonstrate the benefits of taking the time to do that activity in the face of negative task outcomes.

METHODS

3.1 Hypotheses
As motivated by prior research, we pose four hypotheses regarding 1) ingroup formation in non-social scenarios with a non-social task and 2) the resulting benefits regarding robot errors (impacting taskwork and teamwork) and negative task outcomes. Empirical evidence has shown that ingroup relationships with social robots can be formed via the minimal group paradigm and social team-building activities [8, 36, 68]. Consequently, successful group membership creation results in a preferential bias toward ingroup robots in error-free scenarios [20]. In addition, through a non-social task, users will form a mental model of the robot [65]; prior work has shown that transparency resulting in an accurate mental model can aid collaboration, for example by helping users identify reasons for failure [46] and understand robot intentions despite unclear verbal communication [43]. Our first two hypotheses are:

H1: Group membership can be formed with a non-social robot via a non-social training task performed before the main task.

H2: Users working with an ingroup robot will exhibit an ingroup bias toward that robot.

Moreover, the Media Equation theory [48] as well as the "Computers are social actors" paradigm [41] proffer that people behave toward computers as they do in human-human social relationships, as computers elicit similar social attributes. In the context of human-robot interaction, previous research indicates that when working with non-social robots, people respond socially to technical robot actions [23, 60-62]. Thus, we pose two additional hypotheses:

H3: When working with an ingroup robot, participants will be more tolerant of errors than when working with an outgroup robot, regardless of the error setting (main or side task).

H4: If the errors are severe enough, participants will judge the ingroup robot and its errors more harshly than the outgroup robot and its errors (the Black Sheep Effect).

3.2.1 Experimental Design.
The study had a 2 × 3 mixed factorial design with two factors: group membership (within-subjects factor: ingroup and outgroup) and robot error setting (between-subjects factor: no errors, side task errors, and main task errors). We counterbalanced the order in which participants worked with the two robots and randomly assigned participants to one of the three robot error settings.
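To make the design concrete, the following Python sketch (not part of the study software; the function and label names are our illustrative assumptions, and the study's exact balancing scheme is not specified here) shows one way the between-subjects assignment and within-subjects counterbalancing could be operationalized:

import random

ERROR_SETTINGS = ["no_errors", "side_task_errors", "main_task_errors"]

def assign_conditions(participant_index: int) -> dict:
    """Assign one participant to an error setting and a robot order."""
    return {
        # Between-subjects factor: random assignment to an error setting.
        "error_setting": random.choice(ERROR_SETTINGS),
        # Within-subjects factor: alternate which robot (ingroup vs.
        # outgroup) is encountered first to counterbalance order effects.
        "robot_order": ["ingroup", "outgroup"] if participant_index % 2 == 0
                       else ["outgroup", "ingroup"],
    }

for i in range(4):
    print(i, assign_conditions(i))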

3.2.2 Task Context.
We contextualized our investigation in a time-trial pizza-making competition, wherein participants teamed up with two robots (an ingroup and an outgroup robot, one at a time) to complete two sets of three pizza orders each (one set per robot). This competition scenario allows for intergroup differentiation by enabling intergroup comparison (head-to-head competition) of relevant attributes (time taken to complete a pizza and task success) with a similar outgroup (same robot model and behavior) [67]. In addition, we added pressure by informing participants through written instructions that they were competing against another human-robot team and that the fastest team would win, to encourage collaboration and add additional stakes to the task. In reality, there was no other team; the participant-robot team, regardless of which robot the participant worked with, always "lost," causing a negative task outcome. Since the task was a time trial, we also added a countdown timer to the task, as the addition of time pressure has been shown in prior work to be an effective way to increase the severity, or riskiness, of the environment and task [6].
At the beginning of the experiment, each participant first encountered a waiting screen signifying that another participant, their competitor, was in the process of joining; this setup was intended to lend realism to the competition despite the absence of another participant in actuality. The participant was then asked to select one of two robots that they would subsequently train. The robots were colored and named differently to distinguish them; specifically, the yellow robot was named "Sun" and the blue robot was named "Ocean." The robot that the participant chose to train (see Fig. 1 for the training workflow) was considered the ingroup robot and the other the outgroup robot.
Fig. 2 visualizes the overall task flow. For each individual pizza order, the participant requested the ingredients and tools required to create the pizza, and the robot would bring the items (Fig. 2B) in the correct order for the participant to use. The participant then dragged and dropped the ingredients to a location (a pizza pan) to "make" the pizza (C); if a progress bar appeared while an item (e.g., a rolling pin) was being dragged, that item needed to be rolled for three seconds to be added to the pizza. These interactions were explained to the participant during the tutorial phase, and they completed a practice run before undertaking the actual task. (See Section 3.5 for a detailed description of the study's procedure.) Once the pizza was fully assembled, a "bake in the oven" button became interactable (D), shortly followed by a "done" button for the order (F); each pizza baked for 16 seconds. In addition to the main task of making pizza, the robot had a side task, placing dishes on a drying rack (E), which did not involve the participant; this side task was triggered when the robot was idle for two seconds.
3.2.3 Manipulations. We experimentally manipulated group membership and robot error setting, thereby directly manipulating the social categorization of the group (via group membership) and its social identity (via robot errors) [67].
Group Manipulation. To establish group membership, each participant was asked to select one of two robots, differing in name and color, to train on a series of pre-task exercises related to the main pizza-making task; they were notified through written text that the other robot would be trained by another participant. The overall flow of the training session is shown in Fig. 1. The training session was separated into two sections. The first portion (Fig. 1A) involved the participant dragging the robot through a series of waypoints as if "calibrating" the robot. The second part (B) consisted of the participant "teaching" the robot to select ingredients from the fridge and place them on the table by dragging it through a sequential path; this process is analogous to kinesthetic teaching, wherein a user demonstrates motion trajectories by physically maneuvering a robot, commonly used in end-user robot programming by demonstration [3]. At the end of the training session, the robot (C) demonstrated to the user that it had been trained successfully by retrieving ingredients from the fridge as requested. In addition to forming group membership, training helps the user establish norms and mental models about the group's (particularly the robot's) behavior and form a social identity. It is important to note that the training had no impact on the robot's resulting behavior; while the participant believed their chosen robot was being trained by their efforts, the two robots ultimately behaved exactly the same.
Error Manipulation. Each participant was randomly assigned to one of three error settings: main task errors, side task errors, or no errors. The error setting changes whether the group's social identity carries positive or negative connotations, as it impacts robot performance. For the main or side task error conditions, the robot would make two mistakes spaced out amongst the three pizza orders; Fig. 3 illustrates our error manipulations. Main task errors were made while the robot retrieved ingredients or tools needed to assemble the pizza, thereby directly impacting the task. Side task errors were made when the robot executed its side task of placing dishes in the drying rack. These errors occurred while participants were waiting for their pizzas to bake and, therefore, not working on the main task. Although they did not impact the user's main task, they may have lessened the participant's trust in the robot, thereby undermining the teamwork as a whole. Both types of errors were intended to be unexpected to the participant, as during both the practice and training sessions the robot demonstrated that it was able to perform all of its tasks without any errors.
We designed two errors, wrong object (selecting an incorrect task object) and hesitation (the robot moving back and forth as if unsure of the selected task object), for both the main and side tasks. We conducted a prestudy (N = 11) to explore what kinds of errors the robot should make and to determine the various error parameters (e.g., the timing and duration of hesitation). In this prestudy, participants watched videos of a robot executing an ingredient retrieval and evaluated the errors made during the task; one error-free video was also provided to ground the participants' mental models of the task. Participants were asked whether or not they had witnessed an error in the video and to describe it, as a measure of how recognizable that error was; based on the results of the prestudy, we chose the errors with the highest recognizability and severity on a seven-point Likert scale.
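As a concrete illustration of this manipulation, the sketch below (a hypothetical reconstruction, not the study software; the exact placement and mix of the two error types within the three orders are assumptions) builds a per-order error schedule for an error condition:

import random

def build_error_schedule(error_setting: str) -> dict:
    """Map each pizza order (0-2) to an error type, or None."""
    schedule = {order: None for order in range(3)}
    if error_setting == "no_errors":
        return schedule
    # Two mistakes, spaced out: pick two distinct orders of the three.
    for order in random.sample(range(3), k=2):
        # Each mistake is a wrong-object retrieval or a hesitation.
        schedule[order] = random.choice(["wrong_object", "hesitation"])
    return schedule

print(build_error_schedule("side_task_errors"))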

3.3 System Implementation
We used Unity [1] to create an interactive 3D application in which participants could interact with simulated robots; the application was published as a WebGL program that ran in participants' web browsers. Our interactive 3D application included a virtual kitchen environment with interactable task objects governed by Unity's physics engine. We animated the robots using the Cartesian trajectories of their end effectors and calculated their joint poses using inverse kinematics. The robots' behaviors were driven by a finite state machine, where states contained prerecorded animations of a task or erroneous behavior. To increase the realism of the simulated task and create a competitive atmosphere, we used auditory cues for the robots (e.g., motor sounds), cheering noises, and countdown sound effects. The same virtual environment setup was used for both group conditions.
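The following minimal sketch illustrates the style of state-machine logic described above, with the two-second idle trigger for the side task and schedule-driven error states; it is our simplified approximation in Python rather than the actual Unity implementation, and all state names are assumptions:

IDLE_TIMEOUT = 2.0  # seconds of idleness before the side task begins

class RobotStateMachine:
    def __init__(self, error_schedule):
        self.state = "idle"
        self.idle_time = 0.0
        self.error_schedule = error_schedule  # order index -> error type or None

    def tick(self, dt, ingredient_request=None, order_index=0):
        if self.state == "idle":
            self.idle_time += dt
            if ingredient_request is not None:
                # A scheduled error replaces the normal retrieval behavior.
                error = self.error_schedule.get(order_index)
                self.state = error or "retrieve_ingredient"
                self.idle_time = 0.0
            elif self.idle_time >= IDLE_TIMEOUT:
                self.state = "place_dishes_on_rack"  # side task
        else:
            # In the real system a state ends when its prerecorded
            # animation finishes; here every action lasts one tick.
            self.state = "idle"
            self.idle_time = 0.0
        return self.state

fsm = RobotStateMachine({0: "hesitation", 1: None, 2: "wrong_object"})
print(fsm.tick(0.5))                              # still idle
print(fsm.tick(0.5, ingredient_request="dough"))  # hesitation error (order 0)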

3.4 Measures
We employed a set of objective and subjective metrics to measure how group membership influences team performance and user perceptions of robot errors and teamwork (e.g., perceived teamwork, commitment, and relationship). These metrics allowed us to examine potential ingroup bias and the BSE.
3.4.1 Manipulation Checks. We verified that our manipulations of group membership and robot errors were adequate for our experiment.
• Group Membership Check. As a group membership manipulation check, after working with each robot, participants answered two questions regarding which robot they had just worked with and which robot they had trained; these questions also served as attention checks. Participants who incorrectly answered these two questions were excluded from our data analysis.
• Robot Error Check (binary choice). As a robot error manipulation check, after working with each robot, we included one question asking whether the participants saw the robot make an error. We additionally had each participant rate their perception of the robot errors from one to seven; this is of specific interest with regard to perception differences between main and side task errors.
3.4.2 Task Completion Time. Task time, measured in seconds, was defined as the interval between the start of the competition and the moment the pizza was taken out of the oven (Fig. 2). Task time was measured for each pizza rather than as the overall time spent with each robot, as we intended this metric to quantify task efficiency.

3.4.3 Subjective Measures.
We relied on questionnaires to understand participants' perceptions of the robots as teammates and the impact robot errors and group membership had on those perceptions. See the Appendix for a breakdown of the subjective measure scales we created; a sketch of the standard internal-consistency (Cronbach's α) computation follows the list below.
• Perceived Teamwork (rating scale: 1-6). This scale consisted of nine items (Cronbach's α = 0.95) that explored user perception of how good a teammate the robot was. The complete set of questionnaire items used for this scale is provided in the appendix.
• Teammate Commitment (rating scale: 1-6). We created a three-item scale (Cronbach's α = 0.87) to examine user perception of the robot's commitment to the team. The complete set of questionnaire items used for this scale is provided in the appendix.
• Psychological Closeness (rating scale: 1-7). To assess the relationship between the participant and the robot, we adapted the Inclusion of Other in the Self (IOS) scale [5], where the self was represented by one circle and the robot by the other; we used this metric to gauge psychological closeness between user and robot, analogous to how prior work has used behavioral metrics such as proxemics (distance between self and robot) to measure comfort and preference (i.e., ingroup bias) [17].
• Equal Partnership (rating scale: 1-6). One statement, "The robot and I were equal partners in completing the task.", was used to understand whether or not participants perceived themselves to be equal partners with the robot.
• Willingness to Work Again. To measure a participant's willingness to work with the robot again, we employed a binary choice question and a six-point rating scale. At the end of the study, participants were asked which of the two robots they would want to work with again; this question directly assessed participants' robot preferences, i.e., an indication of ingroup bias. We additionally used the statement "I would work with this robot again." to measure participants' preference for each specific robot.
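Because several of these scales are summarized by their internal consistency, the short sketch below shows the standard Cronbach's alpha computation on fabricated toy ratings (illustrative only, not study data):

import numpy as np

def cronbach_alpha(items):
    """items: participants x scale-items matrix of ratings."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of scale items
    item_vars = items.var(axis=0, ddof=1)        # per-item variance
    total_var = items.sum(axis=1).var(ddof=1)    # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy example: five participants rating a three-item scale from 1 to 6.
ratings = [[5, 6, 5], [4, 4, 5], [2, 3, 2], [6, 5, 6], [3, 3, 4]]
print(round(cronbach_alpha(ratings), 2))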

3.5 Study Procedures
After consenting to participate in the study, participants were instructed to watch a tutorial video illustrating how to interact with the robot and navigate the simulated environment. They then filled out an initial questionnaire assessing the tolerance characteristic associated with their personality [2]. Following the questionnaire, they were asked to choose the robot they would like to train and work with; that robot was designated as the ingroup robot, and the other, claimed to be training with a competing participant, was classified as the outgroup robot. Participants then trained their chosen robot following the procedure detailed in Section 3.2.2 and performed a practice task structured similarly to the main task, wherein participants would work with the robot to complete a pizza order. After the practice task, participants continued to work with either their ingroup or outgroup robot; the order of collaboration with the two robots was counterbalanced. Participants completed three pizza orders with each robot, after which they were asked to fill out questionnaires regarding their experience and perceptions of interacting with the robot. They then repeated this process with the other robot. After both rounds, participants were asked to fill out a post-study questionnaire and provide their demographic information. The study was approved by our institutional review board (IRB). It lasted 27 minutes on average, and participants were compensated $8 for study completion.

3.6 Participants
Seventy-six participants were recruited through convenience sampling. However, 14 participants failed our attention checks, a rate within the range seen in other online user studies [10, 26]. Therefore, the results reported below are based on 62 participants (35 female, 25 male, 2 undeclared) with ages ranging from 18 to 63 (M = 27.2, SD = 10.9). Of those participants, 20 were in the no errors condition, 21 in the side task errors condition, and 21 in the main task errors condition. In addition to basic demographic information, we collected self-reported data regarding the tolerance aspect of participants' personalities and any prior experience with robots and technology. In particular, we adapted the IPIP tolerance scale depiction of Cloninger's Temperament and Character Inventory to assess participants' tolerant personality (Cronbach's α = 0.70) [2, 11]; the included questions are listed in the appendix. On average, participants were high-medium tolerant of others (M = 4.36, SD = 0.63 on a six-point Likert scale, with 1 being "strongly disagree" and 6 being "strongly agree"). To gauge participants' prior experience with robots and technology, we used three questions (Cronbach's α = 0.67) listed in the appendix; we found that participants had low-medium prior experience with robots and technology (M = 2.93, SD = 1.18).

RESULTS
For the analysis reported below, we used two-way repeated measures analyses of covariance (ANCOVAs) to examine the main effects of group membership (within-subjects factor) and error setting (between-subjects factor) and their interaction effects while accounting for two covariates: prior experience and tolerant personality. Previous research has demonstrated that prior experience with robots affects people's perceptions of them, such as their level of trust in robotic technology [30, 54]; as a result, we included prior experience as a covariate in our analyses. Additionally, people's personalities have been shown to influence their perceptions of and comfort level with robots [50, 64]; as such, we included participants' tolerance of others as another covariate, as it may impact their perceptions of a robot and its errors. Mathematically, we used a linear model that included the two main factors (group membership and error setting) and their interaction term, as well as the two covariates and their interactions with the within-subjects factor (group membership); the participant was also included as a random effect in the model. Mean centering was applied to the covariates, as suggested by prior works [16, 55]. For post hoc pairwise comparisons, we used Tukey's HSD test. For ANCOVA results, we report adjusted means and standard errors. p < .05 was deemed a significant effect; for effect size, we considered 0.10 ≤ r < 0.30, 0.01 ≤ ηp² < 0.06, and 0.10 ≤ ω < 0.30 as small; 0.30 ≤ r < 0.50, 0.06 ≤ ηp² < 0.14, and 0.30 ≤ ω < 0.50 as medium; and r ≥ 0.50, ηp² ≥ 0.14, and ω ≥ 0.50 as large [12].
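For readers who wish to reproduce this style of model, the sketch below expresses it as a linear mixed model with a per-participant random intercept using statsmodels; the data-frame column names are our assumptions about a long-format layout (one row per participant per robot), not the authors' analysis code:

import pandas as pd
import statsmodels.formula.api as smf

def fit_ancova(df: pd.DataFrame):
    # Mean-center the covariates, as suggested by prior work.
    for cov in ["prior_experience", "tolerance"]:
        df[cov + "_c"] = df[cov] - df[cov].mean()
    # Fixed effects: both factors and their interaction, plus each
    # covariate and its interaction with the within-subjects factor;
    # a random intercept per participant models the repeated measures.
    model = smf.mixedlm(
        "perceived_teamwork ~ C(group_membership) * C(error_setting)"
        " + prior_experience_c * C(group_membership)"
        " + tolerance_c * C(group_membership)",
        data=df,
        groups=df["participant_id"],
    )
    return model.fit()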

Manipulation Checks
As mentioned previously, we did not include data from participants who failed the group membership manipulation check in our analyses. Below, we report our error manipulation checks. A two-way ANCOVA was conducted to assess the influences of group membership and error setting on perceived error severity while controlling for participants' prior experience and tolerance. There was a significant main effect of the robot error setting on perceived error severity (F(2, 57) = 26.29, p < .001, ηp² = .48). On a scale of 1 (least severe) to 7 (most severe), participants, on average, rated the perceived error severity of the main task errors setting as 3.53 (SE = 0.24, 95% CI [3.05, 4.00]), the side task errors setting as 1.84 (SE = 0.24, 95% CI [1.35, 2.32]), and the no errors setting as 1.13 (SE = 0.24, 95% CI [0.65, 1.61]). Pairwise comparisons using Tukey's HSD test showed that main task errors were perceived to be more severe than side task errors (p < .001) and no errors (p < .001) (Fig. 4a). We did not observe any statistical effect of group membership on perceived error severity (F(1, 57) = 0.07, p = .79, ηp² = .001), nor did we see an interaction effect of error setting and group membership on perceived severity (F(2, 57) = 0.35, p = .71, ηp² = .012). Participants' tolerant personalities and prior experience with robots and technology were not significantly associated with their perceived error severity. Additionally, perceived error severity was positively correlated with the perceived impact of errors on task success, r(122) = .82, p < .001, and efficiency, r(122) = .83, p < .001.
When asked whether or not an error occurred during the task (a binary indication of yes or no), 93.0% of the participants who experienced main task errors stated that the robot made a mistake, whereas only 35.0% of the participants in the side task error condition stated that the robot made an error (Fig. 4b). A chi-squared test revealed a significant effect of the robot error setting on what was considered an error (χ²(2, N = 124) = 58.57, p < .001, ω = .69). These results, along with the association between error manipulations and perceived error severity, indicate that our manipulations of robot errors were adequate.
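The chi-squared check above can be sketched as follows; the contingency counts here are illustrative reconstructions from the percentages reported above (the no-errors row, in particular, is a pure assumption), not the raw study data:

import numpy as np
from scipy.stats import chi2_contingency

# Rows: error setting (main, side, none); columns: (reported an error, did not).
# 124 total responses: two robot rounds for each of the 21/21/20 participants.
observed = np.array([
    [39, 3],   # main task errors: ~93% reported an error
    [15, 27],  # side task errors: ~35% reported an error
    [2, 38],   # no errors: assumed near-zero reports
])
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4g}")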

DISCUSSION
Through this study, we examined the role a pre-task training exercise can play in the formation of group membership and the impact that formation has on the user's perception of a robot in a human-robot team. Our exploration reveals the benefits of taking the time, however brief, to do a group membership formation task with non-social robots before conducting a joint task. The results from the study support H1 (a training task can form group membership) and H2 (participants will exhibit ingroup bias) but support neither H3 (participants will be more tolerant of the robot's errors when working with the ingroup rather than the outgroup robot, regardless of error setting) nor H4 (participants will exhibit the Black Sheep Effect).

5.1 Group Formation and Ingroup Preferences
5.1.1 Group formation via a pre-task training exercise. We observed that, through common, non-social robot training exercises, we were able to form ingroup relationships between participants and the robots that they trained; these results support H1. While participants were allowed to choose their robot (minimal group paradigm), which could aid in group formation, the group membership formation was critically dependent on the training task. Before running this study, we ran a pilot study (N = 12) in which the only group membership manipulation was participants choosing their robot; there was no training task, the robot did not make any errors, and the rest of the study was the same as the current study. We did not observe any main effect of group membership on psychological closeness with the robot (F(1, 11) = 1.07, p = .32, ηp² = .088), indicating that participants choosing their robot was not enough to form a group in a non-social interaction scenario. Therefore, the group formation we observed in this study was largely dependent on the non-social training task, mirroring prior work showing that social team-building tasks can successfully create ingroup membership [8]. Our results demonstrate that a non-social team-building task can be effective in forming group membership.
To further highlight pre-task training as critical to ingroup formation, when asked for any additional comments at the end of the study, some participants mentioned the impact training had on their perception or preference. For example, one participant stated, "I felt like the robot I trained, Sun, was faster in giving me items than Ocean." and another mentioned, "I would give the robot another chance, [but] I wouldn't if it had been Ocean since I didn't train him." These statements were made even though the two robots were identical in terms of capability and behavior. The fact that both participants specifically highlighted that they trained their robots shows that training had a large impact on their observations about behavior and willingness to work with the robot again. From the results, we see that when given a binary choice, participants overwhelmingly chose the ingroup robot to work with again (81%).
5.1.2 Benefits of group membership and ingroup preferences. Participants felt psychologically closer to the robot they trained than to the one they did not, as shown in Fig. 5c. One participant stated, unprompted, "Sun [ingroup robot] and I have a closer connection." This finding signals an ingroup relationship forming with the robot that the participants trained, because the closeness metric is an indicator of participant closeness, comfort, and preference. Furthermore, participants exhibited ingroup biases: they were more willing to work with the ingroup robot again than the outgroup one and perceived the ingroup robot more favorably than the outgroup on metrics such as teamwork, commitment, and partnership equality. Our findings of preference toward an ingroup robot parallel prior work that formed group membership with social robots through the minimal group paradigm [36] or a social team-building task [8].
The creation of an ingroup robot benefits the human-robot team relationship. The ingroup robot was viewed as preferable to an outgroup one, supporting H2, despite the two robots in this study behaving identically throughout the task. Additionally, the benefits of a better ingroup robot perception were exhibited notwithstanding the fact that the final result of the pizza-making competition was a loss (task failure) in all conditions, and regardless of whether there was an apparent robot error. We observed that, through the formation of this ingroup relationship, participants perceived the ingroup robot more favorably on several metrics.

5.2 Impact of Robot Errors
5.2.1 Impacts of errors on teamwork experience. As expected, both error settings negatively impacted participants' perception of the robots as teammates. A robot that made errors was seen as a worse teammate, less committed to the task, and not an equal partner. Ultimately, participants were less willing to work with such a robot again. However, it is important to note that the errors had no impact on psychological closeness.

5.2.2 Differences in main and side task errors. Participants perceived the main task error as more severe than the side task one, even though the two errors were exactly the same and had the same impact on their corresponding tasks. This disparity shows that the assignment of severity is subjective and dependent on the participants' "values," not just the objective impact on task success. The difference in the perception of the two errors was further highlighted in that over half (65%) of the participants in the side task condition felt the side task error was not an error, as compared to 7% for the main task error (see Fig. 4b).

5.2.3 Error tolerance. As stated previously, one benefit of forming and working with an ingroup robot is that participants perceive it more favorably than the outgroup one, despite losing the competition. This illustrates that, by forming a relationship and working with the ingroup robot, we can mitigate the impact of task failure (taskwork) on participants' perception of the robot, creating tolerance to negative task outcomes. However, there was no interaction effect between group membership and error type: the ingroup robot was perceived as a better teammate and more committed than the outgroup robot irrespective of whether the robot made a mistake. This shows a unilateral preference for the ingroup robot, continuing to support H2 but not supporting H3.

5.2.4 The absence of the Black Sheep Effect. We did not observe the BSE, i.e., harsher judgment when the ingroup robot makes an error as compared to the outgroup one. There was no interaction effect of group membership and error setting on perceived error severity or good teamwork, not supporting H4. Had the BSE been present, the ingroup robot's errors would have been perceived as more severe than the outgroup's. A necessary antecedent to the BSE is ingroup bias [59], and, as discussed in the previous section, we did observe ingroup bias in the favorable perception of the ingroup robot. However, it is possible that the errors were not perceived as severe enough to impact the team's social perception, or that this was because the robot, unprompted, recovered from the errors right after they occurred. It could also be that the robot was not perceived as having threatened group identity due to the type of activity (functional training rather than a social activity), the type of robot (not easily anthropomorphized), and the type of task and error (work related). The robot may have made a mistake, but it was just a functional mistake, not something transgressive or threatening to the team.

5.3 Implications for Human-Robot Teams
When working in close collaboration with a non-social, non-humanoid robot, we propose that users alter their workflow to include a short training task before embarking on the actual task. The task can be as short as a couple of minutes and as simple as a robot training exercise; it does not need to be the kind of social team-building activity typically used in human-human interaction. This task is not intended to train the user but rather to form group membership with the robot, and as such it may need to be done with every new robot the user works with. For example, if a robot that a user had been collaborating with for a while were replaced with a new one, it would be worthwhile to repeat the pre-task training exercise to maintain a favorable perception of the robot, regardless of whether the new robot was visually or behaviorally identical to the one the user had previously worked with. By forming group membership with the robot, users could be performing a preemptive error mitigation technique in human-robot teaming, providing tolerance to overall task failures.

5.4 Limitations and Future Work
A limitation of this study was that the interaction time for both the main task and the training task was shorter than it would be in real life. Also, the robot was simulated, which has been shown to elicit dampened emotional responses in people compared to real robots [58]; in-person studies are needed to confirm similar effects. In addition, the tasks took less effort and time to complete in simulation than they would in person, which perhaps lessened participants' reliance on the robots to complete the task and, in turn, affected their perception of the robot. Moreover, the task completed by the robot was a simple pick-and-place task, so we need to confirm whether we would see similar effects with a more complex task. Another limitation is that the written instructions provided to the participant during the task introduction, stating that the outgroup robot was being trained by another person, could have influenced group membership formation; additional studies could test this.

There are also many possible pre-task exercises; it would be interesting to explore the impact different non-social pre-task activities have on group formation and ingroup bias. The military, which spends time studying human teaming, could inspire future work exploring additional team-building exercises and human teaming strategies, as Walliser et al. did [68]. In our study, the pre-task activity was related to the main task at hand; it would be valuable to see how an unrelated task, or a team-related activity, of similar length would impact the observed results. Future work could also test the bounds of this pre-task exercise method for group membership formation by looking at the impact that perfect vs. imperfect learning has on its effectiveness in influencing group membership.

The errors made by the robot were perceived by participants to be, on average, at most medium severe (3.53 on a scale of 1 to 7). Therefore, we do not know how group formation, ingroup bias, and tolerance would change in response to a more severe error or a higher-stakes task, as prior work has shown that the effectiveness of error recovery is dependent on the task, error severity, and context [6]. It is possible that such conditions would lead to observations of the BSE. All of the interactions in the study were short-term, continuous collaborations. It would be of interest to explore whether longer-term collaborations or a series of tasks would require repeating the pre-task training activity to re-engage users with the ingroup robot and maintain group membership. Finally, like much prior work, this study only explores group membership through the lens of a partnership. As group size grows, the dynamics change, affecting cooperation: participants exhibit different engagement behaviors during social robot scenarios, and larger groups introduce human-human interaction within the group [9, 37, 57].

CONCLUSION
This work presents a non-social pre-task technique, a robot training session, that forms ingroups to improve human-robot collaboration, particularly in the face of robot errors and negative task outcomes. The results show that we are able to create groups through a short (just over two minutes), non-social, end-user robot training task, causing participants to exhibit favorable perceptions of robots in terms of teamwork, commitment, and willingness to work with them again. Through this relationship, during a non-social task, participants were more tolerant of negative task outcomes. Consequently, we recommend that those who collaborate with robots consider altering their workflow paradigm to incorporate a non-social team-building pre-task, which can be short, whenever working with a new robot, even if that robot looks and behaves the same as the previous one. This exercise will improve user perceptions of the robot and act as proactive mitigation for unfavorable task outcomes.
• The robot was not adept in completing its task.

Teammate Commitment (three items; Cronbach's α = 0.87)
• The robot was committed to the success of the team.
• The robot was committed to the task.
• The robot was committed to winning the contest for the team.

Fig. 1. The training process for the ingroup robot. Group membership was formed through a pre-task training activity aimed at preparing the robot for its role in the main task. Participants performed a waypoint-based calibration (A) and a kinesthetic demonstration of a pick-and-place task (B); the robot then demonstrated that it had learned the task correctly (C).
Fig. 2. The pizza-making process. Participants and robots collaborated to make three pizzas per group membership condition. The robot retrieved ingredients that its user requested (B) while the participant dragged and dropped ingredients to make the pizza (C). When the pizza was fully assembled, it was placed in the oven (D). If the robot was ever idle for two seconds, it began executing its side task of placing clean dishes in the dish rack (E). The trial ended when the pizza was taken out of the oven to be served (F).

Fig. 3. A diagram of possible errors made by the robot, contingent on error condition. The robot in the main or side task error condition makes two errors: retrieving the wrong object or hesitating when placing the object.

Fig. 4. Bar charts of the robot error check and task completion time data; the bars depict adjusted means and the error bars represent the 95% confidence interval. (a) The impact of the main effects of group membership and error setting on perceived severity of error. (b) The distribution of participants for each error setting condition stating whether they observed an error. (c) A plot showing the time taken to assemble a pizza for each trial, illustrating the learning effect across trials. (d) The impact of the main effects of group membership and error setting on task completion time for the fourth through sixth trials only (no learning effect).