Does Performative Autonomy improve Situation Awareness in Highly Demanding Collaborative Tasks?

In human-robot collaboration tasks, robots have the opportunity to communicate relevant information to human teammates, inform them about critical situations, and seek guidance during decision-making. Researchers have recently introduced the concept of Performative Autonomy, an autonomy design approach in which robots perform lower levels of autonomy than they actually possess (e.g., seeking advice that they do not genuinely require) in order to enhance human Situational Awareness. In our study (n = 404), we implemented Performative Autonomy in a resource management game in which a robot periodically interacts with human teammates in ways intended to enhance Situational Awareness. Unfortunately, the experimental testbed used in this work did not demonstrate any impact of this strategy.


INTRODUCTION
The Performative Autonomy paradigm suggests that robots may sometimes operate with lower levels of autonomy than strictly necessary during human-robot collaboration, for the purpose of enhancing human Situational Awareness (SA) [2,6,7] without increasing their Mental Workload [14,19]. In this work, we explore whether Performative Autonomy might achieve its desired effects in the context of a complex resource management game, and what its downstream impacts might be on other dimensions of interaction such as Task Performance. To pursue this goal, we designed a complex online experimental environment in which robots could "perform" lower levels of autonomy through strategic questioning. Within this experimental context, we sought to explore the following research question: R1: Can Performative Autonomy effectively improve human SA during collaborative human-robot interactions, without imposing adverse Mental Workload or impairing Task Performance?
Unfortunately, perhaps due to the nature of our testbed, we were not able to replicate previous demonstrations of the effects of Performative Autonomy in this work.

RELATED WORK
If we specifically consider the performance of lower levels of autonomy through human-robot dialogue, dialogue-theoretic approaches to autonomy design have a long history, including approaches like Mixed-Initiative Interaction [1] and Collaborative Control [8,9,13]. In previous work, Roy et al. [14] proposed six levels of dialogue autonomy inspired by [12] and [22].

These levels (shown in Tab. 1, Col. 2) are arranged from exhibiting the highest level of autonomy to exhibiting the lowest level of autonomy. These six levels of dialogue autonomy can be understood through a Speech Act theoretic perspective [18], as the six levels align with six categories of Illocutionary Acts [17] (shown in Tab. 1, Col. 3). As such, this framework associates different choices of illocutionary acts with different levels of autonomy. Because each of a speaker's speech acts may take a different illocutionary force, an agent may demonstrate different levels of autonomy to achieve different social goals. This tension can be seen in the case of Indirect Speech Acts (ISAs) [16], whose literal and intended meanings can differ. For example, an indirect request like "Could you bring me a tape?", though literally a Yes/No (YN) question, is conventionally understood as a request or command based on sociocultural norms.
It remains unclear, however, how the specific strategy explored by Roy et al. [14], Performative Autonomy, might generalize to other task contexts. Moreover, Roy et al. [14] examined only three of the six levels of autonomy listed in Tab. 1.

Hypotheses
To understand the generalizability of Performative Autonomy to other task contexts, and under a wider array of levels of autonomy, we replicate Roy et al. [14]'s hypotheses:
H1 In contexts where robots truly do not need assistance with their tasks, Performative Autonomy will increase interactant Situation Awareness and Task Performance.
H2 These benefits will be observed across tasks with different baseline levels of Imposed Mental Workload, without excessively increasing perceived Mental Workload.

EXPERIMENTAL DESIGN AND PROCEDURE
To examine these hypotheses in a new task context, we carried out a human-subject experiment in which we systematically manipulated two primary independent variables (Performative Autonomy and baseline levels of Imposed Mental Workload) and assessed two key dependent variables: Situational Awareness (SA) and perceived Mental Workload.
While Roy et al. [14] previously examined Performative Autonomy online in a simulated NASA environment in which the robot's behavior needed to be monitored, and Silva et al. [19] examined Performative Autonomy in an in-person collaborative (but not face-to-face) context with a social robot, in this work we sought to examine the potential of Performative Autonomy in an online interaction with a social robot as part of a more actively collaborative task.
In this experiment (Fig. 1), Prolific crowdworkers were engaged in an 8-minute simulated resource allocation game, similar to that used by Silva et al. [19] but with a higher level of complexity. Participants were asked to fulfill Overcooked-like [3] recipes involving six different categories of items (dowel, screw, wooden board, circuit, wire, and tape). At any given time, six recipe cards were visible to participants, each containing a combination of four of the above items. Participants were instructed to complete recipes by clicking on recipe ingredients, which would expend those ingredients if sufficient resources were available. Once all ingredients within a recipe were expended in this way, that recipe would complete: participants were awarded points, and a new recipe card would be added to the set of recipes available for them to pursue. It was thus important for participants to monitor the task and to avoid wasting time trying to expend ingredients that were unavailable.
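The recipe mechanic described above can be sketched as follows. This is a minimal illustration, not the actual experiment code; the class and function names are our own, and details such as point values and whether items may repeat within a card are not specified in the text.

```python
import random

ITEMS = ["dowel", "screw", "wooden board", "circuit", "wire", "tape"]

class RecipeCard:
    """One recipe card: four required ingredients, completed by expending each."""

    def __init__(self, ingredients):
        self.remaining = list(ingredients)  # ingredients not yet expended

    def click(self, item, stock):
        """Expend `item` if the recipe still needs it and stock is available.

        Returns True once all ingredients have been expended (recipe complete).
        """
        if item in self.remaining and stock.get(item, 0) > 0:
            stock[item] -= 1
            self.remaining.remove(item)
        return not self.remaining

def new_recipe():
    """Draw a random four-item recipe card (illustrative: four distinct items)."""
    return RecipeCard(random.sample(ITEMS, 4))
```

Clicking on an unavailable ingredient has no effect, which is why monitoring resource levels matters in this task.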
Meanwhile, new resources were constantly being collected by a simulated robot in a separate window. Whenever the robot was in the process of collecting some item, that item would be highlighted in the resource collection dashboard. Estimates of how many of each resource were available were listed in this dashboard as well. At any time, participants were free to re-task the robot, asking it to switch to collecting a different resource. In addition, the robot periodically (every 50 seconds) changed its resource collection strategy on its own, switching to whichever resource was at its lowest level.
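The robot's autonomous re-targeting rule can be sketched as below. The function names and the tie-breaking behavior (Python's `min` keeps the first minimum) are our assumptions; the text specifies only the 50-second interval and the lowest-level heuristic.

```python
def pick_resource(stock):
    """The autonomous strategy described above: target the lowest-stock resource."""
    return min(stock, key=stock.get)

def robot_step(stock, elapsed, last_switch, target, interval=50):
    """Every `interval` seconds, re-target to the currently lowest resource.

    Returns the (possibly updated) target and the time of the last switch.
    """
    if elapsed - last_switch >= interval:
        return pick_resource(stock), elapsed
    return target, last_switch
```

A participant re-tasking the robot would simply overwrite `target` directly, leaving the periodic rule to take over again at the next interval.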
Performative Autonomy was manipulated in this experiment by changing the communication strategy used by the robot when changing which item it chose to collect. While the lowest level of Performative Autonomy examined by Roy et al. [14] was the use of a Statement describing what the robot was doing, we additionally used an even lower level: Silent. In this condition, the robot simply did not communicate about the objects it was acquiring.
Our second condition was a Statement strategy, which represented the lowest level imposed by Roy et al. [14]. In this condition, the robot used a statement to inform the human participant about its collection choice.
Our third condition was a YN Question strategy, the medium level of autonomy imposed by Roy et al. [14]. In this condition, the robot proposed a decision and asked the user to confirm it. If the user rejected the robot's suggestion as to which item to collect next, the robot followed up by asking which option they would prefer.
Finally, our last condition was a WH-Question strategy, the highest level of autonomy imposed by Roy et al. [14]. In this condition, the robot presented multiple options and simply asked the user to choose between them, without making its own proposal as to which to choose.
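The four communication conditions can be summarized as a single dispatch over utterance templates. The exact phrasings below are illustrative placeholders, not the wording used in the experiment:

```python
def announce(strategy, choice, options):
    """Utterance templates for the four Performative Autonomy conditions.

    `choice` is the robot's own pick; `options` are the alternatives offered.
    """
    if strategy == "silent":
        return None                                    # no communication at all
    if strategy == "statement":
        return f"I am going to collect {choice}."      # inform only
    if strategy == "yn_question":
        return f"Should I collect {choice} next?"      # propose, ask to confirm
    if strategy == "wh_question":
        return f"Which should I collect next: {', '.join(options)}?"  # no proposal
    raise ValueError(f"unknown strategy: {strategy}")
```

In the YN Question condition, a rejected proposal would be followed by a `wh_question`-style prompt, matching the clarification behavior described above.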
Imposed Mental Workload was manipulated by varying the length of a random keycode used in the resource management task. In the low workload condition, participants were presented with a two-digit keycode on the left side of the screen at the beginning of the game. After selecting any ingredient, participants were asked to enter that keycode in a box displayed at the bottom of the screen and click the submit button. If the keycode entered was correct, the ingredient would be expended, if available. Whether correct or incorrect, participants were then assigned a new keycode to be entered on their next attempt to expend any resource. Mental Workload was varied by changing the length of this keycode from two, to four, to six digits in the low, medium, and high workload conditions respectively.
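The keycode manipulation amounts to the following sketch; function names are ours, and we assume (not stated in the text) that keycodes are uniformly random digit strings:

```python
import random

def new_keycode(workload):
    """Draw a fresh keycode whose length encodes the workload condition."""
    digits = {"low": 2, "medium": 4, "high": 6}[workload]
    return "".join(random.choice("0123456789") for _ in range(digits))

def attempt_expend(entered, keycode, item, stock):
    """An ingredient is expended only if the keycode is correct and stock remains.

    Returns whether the keycode was correct; a new keycode is issued either way.
    """
    correct = entered == keycode
    if correct and stock.get(item, 0) > 0:
        stock[item] -= 1
    return correct
```

Because a new keycode is issued after every attempt, participants must re-read and re-hold a fresh code in memory for each ingredient click, which is what scales the imposed workload with code length.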
Situational Awareness: To measure participants' SA, they were asked, at 80-second intervals, to identify the item with the lowest availability. On each occasion, the screen was blanked and participants were given a forced choice between all six ingredients. Because the screen was blocked, participants could not visually inspect the task or resume it until they answered the SA question.
Perceived Mental Workload was measured by periodically (every 100 seconds) asking participants to self-report their level of perceived Mental Workload on a 1-5 Likert item.

RESULTS
404 American participants were recruited through Prolific. 160 self-identified as male, 195 as female, and 49 otherwise. Participant ages ranged from 18 to 69 (Mean = 34.41, SD = 11.88). Each was randomly assigned to one of our four Performative Autonomy conditions and one of our three Imposed Mental Workload conditions. To assess H1 and H2, we performed a Bayesian Analysis of Variance [20] of the effect of the Performative Autonomy and Imposed Mental Workload conditions on SA and perceived Mental Workload.

Figure 1: Task Design: The figure depicts the task design as seen from both the human and robot perspectives.
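For readers less familiar with Bayesian ANOVA output, the Bayes factors reported below can be read against a standard evidence scale. The helper below follows the commonly used Lee and Wagenmakers (2013) convention, which is our choice of convention rather than one stated in the text:

```python
def bf_evidence(bf):
    """Classify a Bayes factor (BF10) on the Lee & Wagenmakers (2013) scale.

    Values below 1 constitute evidence *against* the effect, read via 1/BF.
    """
    direction = "for" if bf >= 1 else "against"
    strength = bf if bf >= 1 else 1.0 / bf
    if strength > 100:
        label = "extreme"
    elif strength > 30:
        label = "very strong"
    elif strength > 10:
        label = "strong"
    elif strength > 3:
        label = "moderate"
    else:
        label = "anecdotal"
    return f"{label} evidence {direction}"
```

For example, the BF = 0.052 reported below corresponds to 1/0.052 ≈ 19.2, i.e., strong evidence against an effect on this scale.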

Situational Awareness
Extreme evidence was found against any effect of Communication Strategy, and against any interaction between Communication Strategy and Imposed Mental Workload. As such, H1 was not supported in terms of SA: in this experimental context, no SA benefit of Performative Autonomy was shown.

Accuracy-based Task Performance
Strong evidence was found against any effect of Communication Strategy on participant accuracy in entering keycodes (BF = 0.052). As such, H1 was not supported in terms of accuracy-based Task Performance: in this experimental context, no accuracy-based Task Performance benefit of Performative Autonomy was shown.

Communication Reaction time
Finally, we considered the amount of time taken by participants to enter keycodes. Although there was very strong evidence (BF = 35.965) that higher levels of Imposed Mental Workload led to slower reaction times, moderate evidence was found against any effect of Communication Strategy (BF = 0.119), and very strong evidence was found against any interaction between Communication Strategy and Imposed Mental Workload (BF = 0.044). As such, H1 was not supported in terms of reaction-time-based Task Performance: in this experimental context, no benefit of Performative Autonomy was shown.

Perceived mental workload
While strong evidence was found (BF = 14.436) that higher levels of imposed workload led to higher levels of perceived workload, strong evidence was found against any effect of Communication Strategy on perceived Mental Workload (BF = 0.068), and moderate evidence was found against any interaction between Communication Strategy and Imposed Mental Workload on perceived Mental Workload (BF = 0.129). As such, Hypothesis 2 was partially supported: Performative Autonomy did not increase Mental Workload, but there were no observed benefits to demonstrate across levels of Mental Workload.

GENERAL DISCUSSION
In our study, we aimed to understand whether Performative Autonomy would provide the benefits shown by Roy et al. [14] and Silva et al. [19] in more complex collaborative tasks, and with a wider range of imposed levels of autonomy. However, our findings provided strong evidence against any such benefits. We believe that these results may be due to several factors.
Most importantly, we believe that the highly complex nature of this experiment, combined with participants' need to pursue the goal of gathering items while remembering keycodes, meant that the task was simply too cognitively demanding, both washing out results and discouraging participants from ever checking on resource levels in any condition.
Moreover, the experiment did not account for other external factors such as distraction and vigilance failures. The experiment may simply have been too complex to run online, and participants may not have devoted their full attention to the task.

CONCLUSION
In this work, we sought to replicate the results of Roy et al. [14] and Silva et al. [19] in a more complex and collaborative environment. Unfortunately, the extent of this complexity appeared to wash out the benefits of Performative Autonomy. Future work is needed to better explore the limits of Performative Autonomy in complex yet manageable task contexts.

Table 1 :
Dialogue Autonomy Levels & associated Speech Acts