Designing Interactive Agents to Support Emotion Regulation in the Workplace through Guided Art-Making

Workplaces are high-pressure environments where employees often deal with inflexible deadlines, instability in work relationships due to conflict, and the expectations of deliverables -- factors that exacerbate occupational stress and anxiety. While studies have demonstrated the effectiveness of therapeutic art-making interventions for supporting emotion regulation and alleviating occupational stress, there are few deliberate opportunities for employees to regulate their emotional state within the workplace. In this work, we present the design of a voice agent that can guide a user through a therapeutic art-making intervention to promote emotion regulation within the workplace. We share preliminary insights regarding the design of our voice agent, including the importance of embodiment and personalization. We also share insights about the feasibility of our proposed user study, which is aimed at evaluating the effectiveness of our voice agent at promoting emotion regulation in employees through therapeutic art-making.



INTRODUCTION
Workplaces are high-pressure environments. Employees often deal with inflexible deadlines, instability in work relationships due to conflict, and the expectations of deliverables. These factors exacerbate stress and anxiety and contribute to burnout [46]. In an attempt to reduce these negative outcomes, employers are becoming increasingly amenable to supporting the holistic (i.e., mental, psychological, and physical) well-being of their employees by encouraging physical activity [16,61] through installing office gyms, providing on-site psychotherapy opportunities [66], and encouraging employees to take breaks [28,62]. Despite these incremental steps, there is still a lack of deliberate avenues for employees to regulate emotions at work.
Emotion regulation is the process by which people can impact the emotions they choose to express and how they experience these emotions [29]. Difficulties in regulating emotions have been associated with increased emotional exhaustion within the workplace, general fatigue, and negative affect at home [8,64,65]. Art therapy, a type of psychotherapy that uses creative expression to promote mental well-being and facilitate emotion regulation, has been used to manage occupational stress and anxiety [1,20]. Prior work has also explored the use of intelligent agents and interfaces to support art therapists in treating patients with cognitive impairments and promote emotional well-being through guided art-making [23]. Despite the myriad benefits of art therapy, the role of intelligent agents in supporting guided art-making as a therapeutic intervention within the workplace is still under-explored. Furthermore, prior work has demonstrated that digital technology can promote artistic freedom of expression as people are not constrained by physical art materials and can use different modalities, and digital environments are mess-free [69]. Thus, some art therapists incorporate digital media into their practices and encourage their clients to follow along, creating art using a tablet, electronic pencil, or other tools [15,41]. Motivated by these findings, we incorporate the use of art-making using digital media, specifically a tablet, as these interfaces can enable a discreet, technology-enabled avenue for art-making within the workplace.
In this work, we detail the design of a virtual agent, in the form of a voice user interface (VUI), that can guide employees through a therapeutic art-making intervention. We designed a user study using Wizard-of-Oz (WoZ) methodology to enable users to engage in a therapeutic art-making intervention led by a prototype of our virtual agent. We used self-report and physiological measures to infer whether an intelligent agent can facilitate emotion regulation after a stressful event. To assess the feasibility of our study design, we conducted a pilot study. The contribution of this work is the design of an agent that supports employees' emotion regulation through guided art-making as a therapeutic intervention and preliminary insights into our proposed evaluation study.

RELATED WORK
Our related work spans findings across the domains of emotion regulation in the workplace, therapeutic art-making, and intelligent agents used to support positive mental health outcomes.

Emotion Regulation in the Workplace
Emotion regulation, a person's ability to modify the emotions they feel or express, is critical for people in the workforce. Studies have demonstrated that a person's ability to regulate their emotions in the workplace is associated with higher job satisfaction and happiness [49], and improved team-member exchanges [34]. Emotion regulation strategies (ERS) are processes that dictate how a person regulates their own emotions [29,31,55]. Gross's pivotal framework on ERS [29,30] includes strategies such as seeking situations or people that make you feel good, engaging in distraction, and re-framing one's perception of a situation. Diefendorff et al. [21] leveraged Gross's framework to classify ERS used by employees. They found that the most commonly used ERS amongst employees fall into Gross's category of situation selection (e.g., seeking out people who make you feel good), followed by attentional deployment, including distraction and positive re-focus. While these strategies are informally practiced by employees to address their occupational stress, there is a dearth of avenues for employees to deliberately utilize ERS that support their mental well-being.
Art therapy and therapeutic art-making are effective means for facilitating emotion regulation [19,22]. Past work has also demonstrated that art therapy can minimize symptoms of depression, anxiety and stress [31]. A study by Drake et al. [22] suggests that the positive mood-related effects of art-making are stronger when art is used as a distraction. Similarly, Dalebroux et al. [19] found a strong increase in mood when people were encouraged to focus on positive emotions while creating visual art. Prior work has also demonstrated the effectiveness of therapeutic art-making interventions for treating occupational stress [39,63]. A recent review paper found 11 articles that have evaluated the role of art therapy in managing workplace-related stress and anxiety, most commonly for employees who work in medical settings, including nurses, clinicians, and social workers [39].

Intelligent Agents Supporting Mental Well-being
Intelligent agents have been used to enhance mental health outcomes. Prior work has demonstrated the benefits of socially-assistive robots for people with mental health and neuro-developmental disorders [40]. Other work has demonstrated that robots can leverage tactile sensing to adapt the amount of support to provide patients with post-traumatic stress disorders during therapy sessions [7].
Recent work has also explored the use of virtual and embodied agents to support mindfulness. Shi et al. conducted a multi-phase study aimed at understanding the impacts of user-personalization and embodiment on perceived quality of a mindfulness-based agent [59]. They explored the use of no agent, a conversational agent, and a socially-assistive robot. They found that user-personalized characteristics of text-to-speech (TTS) voices are perceived almost similarly to human voices. While prior work suggests the benefits of embodied agents, Shi et al. hypothesized that mechanical noises emanating from the embodied robot distracted their participants from the mindfulness exercise. For this reason, we chose to first explore whether a virtual agent could support therapeutic art-making without the distractions of a physical system.
Recent research has focused on the role of intelligent systems in supporting a person's cognitive [35] and mental [17] well-being by facilitating their creative expression. Several studies have examined how virtual and embodied agents that collaboratively create artwork with humans affect people's psychological and mental health [10,17,41]. While prior work highlights the effectiveness of therapeutic art-making in the workplace (Section 2.1), as well as the utility of intelligent systems in enhancing mental health outcomes for people using agent-facilitated art-making (Section 2.2), there is a gap in research that demonstrates the use of these agents in providing therapeutic art-making within the workplace. Based on prior work, we designed a VUI that guides users through an adapted form of the well-documented art therapy intervention, the "Scribble Technique" [12,36,67]. This activity offers a nonthreatening means of expressing oneself creatively, bypassing a person's normal resistance to art-making [36]. In this adaptation of the Scribble Technique, users were encouraged to focus on the process of art-making: their body movements (distraction) as well as the motions that elicited positive emotions for them.

DESIGN RATIONALE
Voice characteristics impact the quality of human-agent interaction. Therefore, we have designed our voice agent based on prior findings to enhance the user's therapeutic art-making experience. In this section, we discuss the design rationale for a voice-based agent used for guided art-making as a therapeutic intervention.
(1) Voice versus Text: Prior work has suggested a preference for voice-based agents over text-based agents within vulnerable contexts, which include therapeutic interventions. Yu et al. [68] found that people are more likely to skip invasive questions when they have to read and respond by typing than when they simply have to listen and respond by speaking. Nass et al. [51] also found that voice-based agents are easier to use in situations where users cannot type messages themselves or cannot read the agent's written answers. In our study, we encourage participants to interact with the agent during the therapeutic intervention. Therefore, we chose a voice-based agent over a text-based counterpart.
(2) Human versus Synthetic: Prior findings have suggested that human voices were preferred in terms of likability and trustworthiness compared to synthetic voices. However, in recent research, Abdulrahaman suggested that human and synthetic voices are equally effective in reducing feelings of stress [2]. Furthermore, Chérif and Lemoine found that human voices had no discernible effect on perceived trustworthiness compared to synthetic voices [13]. When comparing modern TTS systems to older counterparts, the modern systems were found to be superior in terms of perceived credibility and engagement [18]. These findings support our decision to use a high-quality modern TTS system for our voice agent.
(3) Gender and Expressiveness: In-depth investigations into the gender and expressiveness of voice agents have been conducted [14,58]. While there has been conflicting evidence about user preferences related to the perceived gender of voices [26,44,50,60], prior work supports the use of feminine voices within healthcare settings [60]. Tay et al. [60] found that people expressed a preference for healthcare robots with a feminine, extroverted voice, leading to more positive affect evaluations and slightly greater acceptance compared to male healthcare robots. Goodman et al. [26] also found that in healthcare settings, agents perceived as female were viewed as more trustworthy. Therefore, we used a feminine, extroverted voice to encourage a better user experience in a therapeutic environment.
(4) Visual Animations: While designing the voice agent, it is essential to visualize the agent's utterances to enhance interaction and user experience [5,42]. However, Parmar et al. [53] discovered that during a brief health counseling task, human-like animation may distract from the persuasive message, with the maximum levels of persuasion found when the quantity of agent animation is reduced. This finding encouraged us to develop a VUI that visualizes spoken utterances rather than a human-like agent. For the motion visuals of the agent's utterances, we followed Google's Material Design guidelines [27].

Prototype Implementation
We designed an agent motivated by the design rationale outlined in Section 3. Our VUI wireframe was created using the interface design application Figma. We recorded the speech from IBM Watson's TTS software and selected the expressive feminine voice of "Emma". We used the motion graphics software Adobe After Effects to synthesize visual animations of the agent's speech and listening behaviors. Fig. 2-(i) represents the agent's speech. The line waves reflect the volume and fluctuation of the voice. Fig. 2-(ii) depicts the agent's listening. The paused line with gradient color indicates a waiting state. Participants could also interface with the prototype through the GUI, for example to select an icon (see Fig. 3, steps 2 and 10). We outline a summary of the behaviors of the voice agent in Figure 3. When developing the behavior of our agent, we consulted literature concerning art therapy [45,47,48] and therapeutic art-making, as well as a licensed art therapist. Conversations with this therapist raised questions about the ethical ramifications of suggesting that an intelligent agent alone could provide mental health support on par with a human therapist. Thus, in addition to informing users about how to interact with the voice agent, we also advised them to consult a licensed mental health provider should any negative emotions arise during sessions with the agent (this is done at steps 1 and 10 in Fig. 3).
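The scripted behavior summarized in Figure 3 can be pictured as an ordered list of steps that a Wizard advances through. The following is a minimal sketch under stated assumptions, not the prototype's actual implementation: the `Step` class, the `SESSION_SCRIPT` list, and the `next_step` helper are hypothetical names introduced for illustration, and the step summaries paraphrase the Figure 3 caption. The only flags encoded are those stated in the text: GUI selection at steps 2 and 10, and the reminder to consult a licensed provider at steps 1 and 10.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    """One scripted agent behavior (hypothetical structure for illustration)."""
    number: int
    summary: str
    gui_selection: bool = False      # user may select an icon through the GUI
    referral_reminder: bool = False  # agent advises consulting a licensed provider


# Step summaries paraphrase the Figure 3 caption.
SESSION_SCRIPT = [
    Step(1, "Introduce the agent and explain how to interact with the VUI",
         referral_reminder=True),
    Step(2, "Ask the user to express their feelings", gui_selection=True),
    Step(3, "Give instructions for the therapeutic art-making activity"),
    Step(4, "Guide the user through a warm-up scribble activity"),
    Step(5, "Guide the user through drawing on the tablet"),
    Step(6, "Encourage scribbles"),
    Step(7, "Encourage scribbling until the user feels finished"),
    Step(8, "Encourage the user to review their scribbles"),
    Step(9, "Encourage the user to verbalize their reflection"),
    Step(10, "Ask how the user feels after the activity",
         gui_selection=True, referral_reminder=True),
    Step(11, "Conclude the session"),
]


def next_step(current: Optional[Step]) -> Optional[Step]:
    """Return the step after `current`, or None once the script has ended."""
    if current is None:
        return SESSION_SCRIPT[0]
    index = current.number  # step numbers are 1-based, so this is the next list index
    return SESSION_SCRIPT[index] if index < len(SESSION_SCRIPT) else None
```

Keeping the script as plain data is one way to help a single Wizard reproduce the same sequence of behaviors for every participant.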

Proposed Study Design
Prior work has suggested that therapeutic art-making is associated with increased positive valence after the intervention [19]. Our proposed user study aims at investigating the following research hypothesis: The voice agent-led art therapy activity will lead to an increase in positive valence compared to the self-guided approach.
To assess our hypothesis, we designed a between-subjects user study. In this study, we measure participants' reactions to a stressful task [37,38]. Participants are randomly assigned to either the voice agent-guided therapeutic art-making intervention (experimental) or the self-guided art-making (control) condition prior to the start of the experiment. To assess the feasibility of our proposed study design, we conducted a pilot study with the authors of this work.
(1) Study Procedure: Upon arrival, the experimenters informed participants of the purpose and structure of the study. The study was recorded through the Zoom videoconferencing software. Participants were instructed to sit in front of a monitor that displayed a secure survey form. At the beginning of the study, we collected participants' baseline emotional state. We aim to infer participants' affective state through self-report and physiological measures. For the duration of the study, the experimenters left the room to avoid biasing the participants' responses. The proposed study is divided into 3 phases. In the first phase, participants provided self-report affect measurements and filled out their demographic information. After this was complete, they informed the experimenters, who then instructed them to continue to the second phase. In the second phase, participants completed the Cognitive Emotion Regulation Questionnaire (CERQ) [25], which provides insight into the ERS our participants use when faced with a negative event.
Participants then moved into a stress task adapted from [37]. In this task, participants were told that previous participants completed a long form in less than 2 minutes to create time pressure. Unbeknownst to them, participants were given less than 2 minutes before the form automatically advanced. They were then asked for feedback as to why they did not complete the form in 2 minutes. They were also asked to provide post-stress-task self-report affect measurements and instructed to call the experimenters back into the room to administer an intervention activity. For our experimental condition, we shared our implemented prototype to the participant's desktop screen using Zoom's "screen share" feature. We enabled the "remote control" feature on Zoom to allow the participant to make on-screen selections when necessary (for example, in Step 2 of Fig. 3). The "Wizard", who was the same researcher for the duration of the study, was equipped with a script to ensure the interface followed the same set of behaviors across all participants. In the control condition, participants were given a set of instructions that followed the same set of behaviors as the script used to control the WoZ prototype. In the final phase, participants filled out the Self-Expression and Emotion Regulation in Art Therapy Scale (SER-ATS) Questionnaire [32] to evaluate the effect of the art-making intervention on their emotion regulation. The researchers returned to the room to conduct the post-study, semi-structured interview and fully debrief participants. This proposed study took approximately 40 minutes. In future work, participants would be provided a participant ID. This participant ID would be used to randomly assign participants to our research conditions. We will recruit an even number of participants to ensure they are balanced between the experimental and control conditions.
(2) Measures: (A) Quantitative Data: Participants report their emotional state using the Affective Slider [6], an empirically validated digital tool used to measure 'pleasure' (valence) and arousal. For physiological measures, prior work demonstrates the use of electrodermal activity (EDA) and heart rate variability (HRV) to measure stress and anxious responses to a stimulus [9,24,43,54,56]. Thus, we collected EDA and HRV through a Galvanic Skin Response (GSR) sensor and an Apple Watch, respectively, for post-hoc affect analysis. In future work, we will normalize the affect measures to account for variability in baselines across participants [43]. (B) Qualitative Data: We elicited participant feedback through a semi-structured interview to understand participants' subjective user experiences.
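The balanced random assignment and baseline normalization described above can be sketched as follows. This is an illustrative sketch only, under several assumptions that do not come from the study: the function names, the phase labels (`baseline`, `post_stress`, `post_intervention`), and the use of simple subtraction from baseline as the normalization are all hypothetical, and the normalization actually applied following [43] may differ. Valence values are assumed to lie on a common numeric scale (0 to 1 is used here for illustration).

```python
import random
from statistics import mean


def assign_conditions(participant_ids, seed=None):
    """Randomly split participant IDs evenly between the two conditions."""
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of participants is needed for balanced conditions")
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    return {pid: ("experimental" if i < half else "control")
            for i, pid in enumerate(ids)}


def change_from_baseline(readings):
    """Express each later reading relative to the participant's own baseline."""
    baseline = readings["baseline"]
    return {phase: value - baseline
            for phase, value in readings.items() if phase != "baseline"}


def mean_change_by_condition(assignment, readings_by_participant, phase):
    """Average the baseline-corrected value of `phase` within each condition."""
    changes = {"experimental": [], "control": []}
    for pid, condition in assignment.items():
        corrected = change_from_baseline(readings_by_participant[pid])
        changes[condition].append(corrected[phase])
    # Skip conditions with no participants so that mean() is never given an empty list.
    return {condition: mean(values) for condition, values in changes.items() if values}


if __name__ == "__main__":
    assignment = assign_conditions(["P01", "P02", "P03", "P04"], seed=1)
    print(assignment)
```

Correcting each participant against their own baseline is one simple way to make valence changes comparable across people who start the session in different emotional states; a descriptive comparison of the two condition means would still need an appropriate statistical test before drawing conclusions about the hypothesis.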

PRELIMINARY INSIGHTS AND FUTURE DEVELOPMENT

Insights on Future Agent Design
Prior work suggests that user-personalization of voice agents for therapy is preferred [59]. Because we tested only one voice agent with fixed characteristics, this may have affected the user experience. In the future, we would like to enable users to have more control over the characteristics of the voice agent and employ these features in our prototype, similar to prior work [57]. Additionally, we would like to develop an interface that provides personalized activities based on the user's emotional state [35]. Furthermore, during the pilot study, when interacting with the agent, one author expressed uncertainty about reproducing the agent's commands without visual demonstration, as well as concerns about adherence to commands. This motivates our desire to evaluate the effect of an embodied agent on adherence [4,11], particularly for mental health activities [52], and its role in providing demonstrations to better facilitate guided art-making as a therapeutic intervention.

Challenges and Insights on Study Design
Our aim was to elicit negative emotions through our stressful task. Our procedure, which was motivated by prior work [37,38], raises moral and ethical questions about emotion elicitation through deception [3,33]. In future work, we aim to explore emotion elicitation methods that reduce potential distress. Furthermore, the agent-guided condition was time-limited compared to our self-guided condition, in which participants were able to self-pace. For our future study, we aim to incorporate time-limited guidance into self-guided instructions to reduce variability between our study conditions.

Figure 1: An interaction between a user and voice agent informed by our design. (i) User is stressed by a work-related task. (ii) User opens up voice user interface for therapeutic art-making. (iii) Agent explains how a user should interact with the system. (iv) Agent asks the user to report their starting emotional state as grounds to select a therapeutic art-making activity. (v) User follows the guidance of the voice agent in performing the therapeutic art-making activity on a tablet. (vi) Agent concludes session by asking user to reflect on their experience and report their final emotional state. (vii) Agent concludes the session.

Figure 2: The visual animations of the agent while (i) speaking and (ii) listening.

Figure 3: Steps demonstrating the voice agent's behavior along with text adapted from speech at each step. Step 1: Introduction and providing instructions on how users can interact with the VUI; Step 2: Asking the user to express their feelings; Step 3: Giving instructions on therapeutic art-making activity; Step 4: Guiding the user through a warm-up scribble activity; Step 5: Guiding the user through drawing on an iPad; Step 6: Encouraging scribbles; Step 7: Encouraging scribbling until user feels finished; Step 8: Encouraging user to review their scribbles; Step 9: Encouraging the user to verbalize their reflection; Step 10: Asking how user feels after the activity; Step 11: Concluding the session.