An Exploratory Study on People's Intuitive Understanding of Expressive Robot Behavior

Robots are anticipated to communicate with humans in social contexts. Nonverbal communication by robots increases comprehension because humans intuitively understand such behavior from their experience of human-human interaction. This makes nonverbal behavior suitable for studying what renders robot motion intuitive to understand in human-robot interaction (HRI). We study how robot features help humans intuitively grasp expressive robot gestures, concentrating on what makes behavior easy to interpret. After watching eighteen nine-second videos of three kinds of robots demonstrating expressive behaviors, 50 participants completed an open-ended survey. Our findings highlight the inputs, mediating factors, and outputs that users reported after observing examples of expressive robot behavior. These insights are a starting point for analyzing robot behavior from the perspective of intuition and provide a foundation for a theoretical framework for intuition in HRI.


INTRODUCTION
Robots are slowly becoming part of our social environment as interaction partners, especially with the arrival of social robots. These robots are specifically designed for social interaction [7] and can use different movements and morphologies to interact with users [15]. Using movement and other forms of nonverbal cues such as facial expressions, gaze behavior, and gestures [28, 32], robots can communicate information about their intentions to users [28]. As a result, the robot's behavior becomes understandable [7, 10, 38].
Nonverbal cues provide a natural and intuitive means of communication in human-human interaction [17], especially for emotion [23]. This notion invites the question of what contributes to an intuitive understanding of a robot's nonverbal cues from a user's point of view. Studying how users initially understand and interpret motion can provide insights into how the nonverbal behavior of robots can be improved [31]. Furthermore, these insights could aid in developing a framework for intuitive robot thinking that uses human intuitive understanding as its foundation.
Therefore, in this exploratory study, we address the question, "How do people intuitively understand the expressive nonverbal cues of different types of robots?". To answer it, we conducted a qualitative observation study combined with an open-ended survey. In doing so, we hope to contribute to the field of HRI by offering intuition as a new lens for evaluating users' reactions to robot behavior and by providing a foundation for an HRI framework of intuitive robot thinking grounded in human cognition.

Theoretical Background on Intuition
Making assumptions about how someone is feeling, and nearly any other initial observation we make about something, can be viewed as intuitive perception [8]. Although there is no consensus on the definition of intuition [5, 11, 42], within the scope of this project we look at intuition from a cognitive science perspective. This perspective regards intuition as a decision-making process that occurs on a non-conscious level [1]. Intuitive decisions revolve around the unconscious comparison between patterns in the current situation and memorized patterns from previous experiences. This comparison generates a response, based on successes in previous encounters, that is adapted to the current situation [24]. These patterns are often captured in heuristics to reduce the mental effort of decision-making [1, 5, 8].
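The pattern-comparison account above can be made concrete with a toy sketch (our illustration, not a model from the literature): a new situation is matched against memorized patterns, and the response attached to the most similar past experience is reused. All names and data below are invented for illustration; similarity is reduced to simple feature overlap as a stand-in for the non-conscious comparison described in the text.

```python
# Toy illustration of intuition as pattern matching: reuse the response
# stored with the most similar remembered situation. All data invented.

def closest_memory(situation, memories):
    """Return the response of the remembered pattern most similar to
    the current situation, where similarity is plain feature overlap."""
    def overlap(a, b):
        return len(set(a) & set(b))
    best = max(memories, key=lambda m: overlap(situation, m["pattern"]))
    return best["response"]

# Hypothetical memorized patterns from previous encounters
memories = [
    {"pattern": {"smile", "open arms"}, "response": "approach"},
    {"pattern": {"frown", "turned away"}, "response": "keep distance"},
]
```

For example, a new situation containing a smile overlaps most with the first memory, so its stored response is reused and adapted.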
Furthermore, emotions play a crucial role in this process [8]. First, emotions linked to memories of situations might trigger reactions that guide decision-making [4]. Second, when intuitive decisions are made, brain regions related to emotion and social interaction are active [24]. Even though this process happens non-consciously, it can surface as a gut feeling [6]. Lastly, emotions provide a holistic way of processing information, creating a direct understanding of a situation [34].
Kahneman's Dual Process Theory [8] describes intuitive decision-making in terms of two mental processes. Intuitive thinking, called System 1, is automatic and fast, relies on emotions and impressions, and is prone to errors and bias. Rational thinking, called System 2, is relatively slow, focused, and effortful. System 1 is active in everyday behaviors, whereas System 2 attempts to suppress impulses from System 1 if these are inappropriate [8]. Other scholars also propose dual-process models [25], while some oppose this view [11].

RELATED WORK
Studies applying the Dual Process Theory in HRI suggest its relevance for analyzing human perception of robots [20], with a focus on robots' surface features and human intuition. Jahn & Nissen [16] explored the roles of System 1 and System 2 in anthropomorphizing robots, measuring humanness, trust, and warmth, and examining how the two systems mentally categorize robots differently. Spatola & Chaminade [36] incorporated anthropomorphism into cognitive control theory; their work uses dual processing theory as a foundation and adds the layers of social cognition (System 1) and physical cognition (System 2). Furthermore, Smedegaard [35] offers a perspective on the novelty effect in social interactions with robots, arguing that there is a knowledge gap: what users know about social interaction with humans does not transfer to robots because users lack experience with them. This notion relates to intuition because intuition builds on experience [11]. Thus, we must understand how intuition works in HRI to understand this gap.
Lastly, studies have examined the influence of embodiment on users' perception of facial expressions [3]. In this work, however, we focus on whole-body movement.

METHOD
Through a social media platform, we recruited 50 participants (14 men, average age = 28.4, SD = 6.4; 36 women, average age = 25.2, SD = 3.49). They were selected based on their ability to read and write English and their lack of prior experience interacting with robots.
Before the experiment, participants received information about the study and signed informed consent. Participants were free to withdraw from the study without providing an explanation or facing consequences. Upon completing the survey, all received a compensation voucher of about 10 euros. The study was conducted according to the ethical guidelines of the Norwegian Centre for Research Data (NSD) (Ref. Nr. 863469). The data were collected on a dedicated computer and stored on the Services for Sensitive Data (TSD) platform (Ref. Nr. p2017) owned by the University of Oslo, Norway.
During the study, participants watched eighteen unique videos of three robots performing expressions of happiness and sadness from three different angles. These two emotions were selected because they are the most widely applied in HRI research [2, 23, 38, 41] and the most easily recognized by humans [29, 30]. The robots used in this study were PLEO, an animal-like robot [37]; Pepper, a humanoid robot [27]; and TIAGo, a servant robot [26] (see Fig. 1). They were selected based on diversity in design, capability of whole-body movement, similar expressive body parts (head and either arms or legs), availability, and preprogrammed expressive behaviors. Combining Laban movement principles with earlier research on expressive robot behavior design [2, 38, 41], the researchers classified the preprogrammed expressive robot behaviors. A panel of six judges then evaluated the selected behaviors in a two-alternative forced choice (happy or sad) to validate whether the robots' behaviors were perceived as intended. Only when all six judges unanimously agreed on the classification were the videos included in the study.
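The unanimity criterion above is a simple filter over the judges' forced-choice labels. As a minimal sketch, assuming invented clip identifiers and judge data (the paper reports only the criterion, not the raw ratings):

```python
# Hypothetical sketch of the unanimity criterion: a clip enters the
# stimulus set only if all six judges assign it the same forced-choice
# label ("happy" or "sad"). Clip names and ratings are invented.

def unanimous_label(ratings):
    """Return the agreed label if all judges gave the same answer, else None."""
    return ratings[0] if len(set(ratings)) == 1 else None

# Illustrative judge data: (robot, emotion, angle) -> six forced-choice labels
candidate_clips = {
    ("Pepper", "happy", "front"): ["happy"] * 6,
    ("TIAGo", "sad", "side"): ["sad"] * 5 + ["happy"],  # one dissent: excluded
}

included = {clip: unanimous_label(r)
            for clip, r in candidate_clips.items()
            if unanimous_label(r) is not None}
```

Unanimity is a stricter inclusion rule than majority vote or agreement statistics such as Fleiss' kappa; it trades stimulus quantity for unambiguous perceived valence.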
The videos of the robots were made against a gray background. They were captured from three viewpoints to minimize familiarity with the content and avoid similarity between clips while providing more information about the robots' expressive motions [18]. Every participant saw all eighteen trials in a pseudorandomized order, without audio, to prevent potential bias. The experiment was self-paced and held in the Department of Psychology's pupillometry and eye-tracking laboratory on the university campus.
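The paper does not specify the pseudorandomization constraint, so purely as an illustrative sketch, one common choice is a shuffle that avoids showing the same robot twice in a row. The function name, the constraint, and the retry strategy below are our assumptions, not the authors' procedure:

```python
# Illustrative pseudorandomization sketch (assumed constraint: no two
# consecutive trials with the same robot). Rejection sampling over
# seeded shuffles; falls back to the last shuffle if no order is found.
import random

def pseudorandom_order(trials, seed, max_tries=1000):
    rng = random.Random(seed)
    order = trials[:]
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(order[i][0] != order[i + 1][0] for i in range(len(order) - 1)):
            return order
    return order  # fallback: unconstrained shuffle

# The 18 trials: 3 robots x 2 emotions x 3 angles
trials = [(robot, emotion, angle)
          for robot in ("PLEO", "Pepper", "TIAGo")
          for emotion in ("happy", "sad")
          for angle in ("front", "side", "3/4")]
```

Seeding per participant would make each order reproducible while still varying between participants.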
After viewing all the video clips, participants completed a survey with open-ended questions about how easy they found the robots to understand, which features influenced that ease and their decision-making, and how they felt about the potential ease of interaction. The entire experiment lasted 30 to 45 minutes.
The data were analyzed using thematic analysis with an inductive approach. Based on the analysis, we identified three main themes related to the participants' intuitive impressions and experiences while observing the robots. The first theme, 'Inputs from observing expressive robots', consists of the robot's design and motion. The second theme, 'Mediating factors for interpreting robots' expressions', consists of familiarity with robots and other mediating features. The last theme, 'Intuitive outputs from observing the expressive robots', consists of ease of comprehension of the robot's motion, evoked feelings and impressions, and criteria for robots to be good interaction partners.

FINDINGS

Inputs from observing expressive robots
The intuitive inputs that participants received when observing the different robots were a) the design of the robots and b) their motion. The robot's design comprises its overall appearance and expressive features, consisting of head and facial features (N = 47) and limbs (e.g., arms, feet, legs, and tail) (N = 40). A combination of motion and expressive features provides the fundamental information that users draw on when predicting robot behavior. Participant 15 [P15] said, "Usually limbs and eyes, or whatever is moving". "Facial expressions with eyes, rapid movement of the limbs and the tail to a certain extent" [P28]. "The facial expressions are the most easily targeted: their eye, mouth, etc. The hands/feet movement also helped me guess to what extent or how strong the robot is expressing their emotions" [P33]. Five participants argued that the robot's motion is a more important input than its appearance. "Rather than appearance, the number of parts of the robot that could show emotion with the movements and gestures made me choose, but the appearance also helped" [P28]. Other motion features considered as input were the motion's speed (N = 6) and its intensity (N = 1).

Mediating factors for interpreting robots' expressions
Factors mediating the interpretation of a robot's behavior are familiarity and several other factors. Familiarity is a construct of resemblance and knowledge & experience. The robot's design evokes resemblance: its appearance can trigger a resemblance to another concept the user has seen before, e.g., a human or an animal. PLEO was often associated with an animal and Pepper with a human, while TIAGo lacked an unmistakable resemblance, being described as a one-armed robot or even as a coffee machine. These associations can guide the user's interpretation of the robot's behavior, making it easier for users to interpret the behavior or gestures of a robot. "The robot's appearance mimics that of humans and animals, so I tend to interpret their behavior based on my understanding of human and animal behavior" [P10]. "Interpreting the more human-like robot and animal is easier because those are what we encounter daily" [P16]. Furthermore, knowledge and experience are part of familiarity. Knowledge is accumulated through the cumulative experience users have with robots. As the participants gained more experience from observing the robots throughout the experiment, they mentioned that it became easier to understand them. "Sometimes it is hard to know what the robot was showing, but after seeing it again, it got easier" [P37]. They argued that they became more familiar with the robots' movements, identifying happy versus sad emotions. "I got more familiar with the three types of robots and how they move" [P14]. "As I saw more samples, I got an idea of emotion to choose more precisely" [P40]. Based on this experience, two participants even mentioned that they started focusing on different parts of the robots toward the end of the experiment compared to the beginning.
Participants also mentioned several other features that mediated their interpretation of the robots' behavior: the observation angle (N = 6), the amount of motion (N = 9), and complications due to the setup of the experiment (e.g., the repeated measures (N = 16) and the time constraints on observing the robots (N = 1)). "Different angles gave me different information, which I used to decide the emotion I perceived." [P42].

Intuitive outputs from observing the expressive robots
The outputs of the participants' observations were increasing ease in comprehending the robots' movements, feelings and impressions the robots evoked in the users, and some recommendations for making robots better interaction partners. Participants stated that PLEO (a dinosaur robot) was the second easiest to comprehend (N = 11) after Pepper (a humanoid robot), whereas TIAGo (a servant robot) was the most difficult to understand (N = 18). "The humanoid robot was easier to read, and the dino was cute. The one-armed felt the least readable to me." [P6]. This comprehension was achieved through a combination of inputs and mediating factors. "The more humanoid was easier to understand than the 'robotic' robot. The movements were obviously mechanical, making it harder than a human." [P6].
Furthermore, the robots made positive (N = 17) and negative (N = 19) impressions on the participants. The positive impressions mostly came from the humanoid and the dino (N = 13), which were seen as friendly, cute, and expressive. Three participants claimed that the robots' friendly impression guided their judgment of them. Conversely, the negative impressions mostly came from the servant robot because it was perceived as bored, angry, or unalive (N = 10). "The one-armed robot seemed aggressive" [P32]. Confusion was also mentioned three times, caused by incongruency between facial expression and body movement.
Lastly, participants provided recommendations for smooth interactions between humans and robots. They mentioned that robots should use multiple features to communicate with users (N = 6) (e.g., verbal and nonverbal communication), be able to convey emotions to and understand emotions of a user (N = 4), and move fluently and less mechanically for better readability of their motions (N = 4).

DISCUSSION
Understanding intuition in HRI will help us better understand behavior and interpretation in human-robot interactions and, as a result, create more efficient robots. Several studies have examined the design of robot behavior and robot cognition [7, 33]. However, these studies did not explicitly consider users' intuitive thought processes as a foundation for developing robot behaviors and cognition. Therefore, this paper addresses the research question, "How do people intuitively understand the nonverbal communication of different types of robots?". To answer it, participants were shown videos of three different robots displaying expressive behaviors and then completed an open-ended survey about the factors that contributed to their ease of understanding the robots.
Our findings suggest that three elements constitute users' intuitive thinking process when observing the expressive motions of different robots: inputs, mediating factors, and outputs. The inputs consist of the design of the robot, including its features (head/face and limbs, i.e., arms, legs, feet, and tail), and the robot's motion. Mediating factors consist of familiarity and other factors. Familiarity refers to the associations with other animate objects, such as humans or animals, that the robot's design evokes, and to the knowledge & experience a user has with similar animate objects, both of which serve as a foundation for interpretation. The outputs refer to the ease of comprehension of the robot, the positive or negative impressions and emotions the robot evoked in the user, and criteria for human-robot interaction.
As the theoretical framing mentions, we adopt a cognitive science perspective on intuition. In this field, intuition is a decision-making process at a non-conscious level [1] in which comparisons are made between memorized and current patterns to interpret a situation [24]. Considering our findings and this theory, the design and movement of the robot are essential inputs for the user's intuitive interpretation. On the one hand, the robot's appearance influences the expectations people have [13] and how we feel about robots [9]. Some would argue for moderate levels of anthropomorphism in appearance, combined with social cues, to improve interaction and acceptance [19, 39]. Since humans are already hardwired to pay attention to human likeness, this incorporation should be easy [31]. However, designers and developers should be mindful of high levels of anthropomorphism because they make users uncomfortable [13].
Another essential element of our findings is the concept of familiarity, which our findings suggest is a construct of the robot's resemblance to a similar animate agent and the knowledge and experience with such similar agents. Whittlesea [43] defined familiarity as the perception of a stimulus linked with the recollection of prior experiences. For example, the more a user interacts with a robot over time, the more familiar it becomes [14]. Van de Walle et al. [40] define familiarity in terms of understanding and getting involved with technology, unconsciousness, easiness, comfort, and friendliness. Other works on familiarity also emphasize the link between familiarity and the robot's appearance. According to Saunderson and Nejat [32], exposure to stimuli similar to those the user has already encountered triggers familiarity. They also argued that familiarity has two components: (1) memorable exposure (the amount of time spent with the robot) and (2) human likeness (the influence of appearance). Furthermore, they argue that familiarity increases liking. Additionally, other studies [9, 22] emphasize that a sense of familiarity with a robot can increase its acceptance. However, human likeness should not threaten humans' sense of distinctiveness, because that could lead to resistance to use and, therefore, decrease the acceptance of robots.

Limitations of the study
This study has several limitations. First, the imbalance in gender representation, with more females than males, might have impacted the results, given female participants' potentially greater attention to emotions [12]. Second, Mara et al. [21] advocate using video materials in a controlled lab setting rather than actual interactions with a robot, which could introduce biases; however, their study focused solely on humanoid robots. Third, the restriction to only three types of robots might limit broader insights across the existing variety of robots. Fourth, using open-ended questions at the end of the video observations could make it difficult to distinguish between intuitive and rational thinking. Lastly, the survey questions might only partially capture intuitive responses, since they focus solely on what makes a robot easy to understand, whereas intuition comprises several elements.

CONCLUSION AND FUTURE WORK
In this study, we investigated how people intuitively perceive the expressive nonverbal communication of social robots. Our findings suggest that different inputs, mediating factors, and outputs are part of this intuitive decision process. The inputs are the design and motion of the robot; the mediating factors are familiarity and other factors; and the outputs are ease of comprehension of the robot's motion, the impression the robot makes on the user, and suggestions for improving robotic behavior. With this work, we contribute to the field of HRI by identifying factors that aid in understanding robotic movement from an intuitive point of view.
Future research could measure users' intuitive understanding with different tools for measuring cognitive activity, e.g., pupillometry, accuracy measurements, response-time measurements, or specialized surveys such as the NASA TLX. Furthermore, future work could investigate the intuitive understanding of the motion of a care robot. Such a study could explore the balance between emphasizing the efficiency and quality of service delivery versus the robot's nonverbal communication during task execution in a real-world interaction, for instance by examining the relevance of nonverbal communication in such contexts and how to shape these nonverbal behaviors accordingly.

ACKNOWLEDGMENTS
Firstly, the authors thank those who participated in the study. They acknowledge the Eindhoven University of Technology, which provided the robots used in the research. Furthermore, they appreciate the contributions of the research assistant, Shoaib Nabil, who assisted with data collection and part of the experimental setup. Lastly, they thank The Research Council of Norway (RCN) for its support as part of the Predictive and Intuitive Robot Companion (PIRC) project under grant agreement no. 312333, the Vulnerability in Robot Society (VIROS) project under grant agreement no. 288285, and through its Centres of Excellence scheme, RITMO, project no. 262762.

Figure 1: The different robots used in the study, presented from different angles (from top: front, side, and 3/4 angle). From left to right: Pepper, PLEO, and TIAGo.