Unlocking Human-Robot Dynamics: Introducing SenseCobot, a Novel Multimodal Dataset on Industry 4.0

In the era of Industry 4.0, human-robot collaboration (HRC) is central to the advancement of modern manufacturing and automation. Understanding the physiological responses of operators when they interact with a cobot is essential, especially during programming tasks. To this end, wearable sensors have become vital for real-time monitoring of worker well-being, stress, and cognitive load. This article presents a new dataset (SenseCobot) of physiological signals recorded during several collaborative robotics programming tasks. The dataset includes ElectroCardioGram (ECG), Galvanic Skin Response (GSR), ElectroDermal Activity (EDA), body temperature, accelerometer, ElectroEncephaloGram (EEG), and Blood Volume Pulse (BVP) measurements, together with emotions and subjective responses from NASA-TLX questionnaires, for a total of 21 participants. By sharing dataset details, collection methods, and task designs, this article aims to drive research in HRC, advancing understanding of the User eXperience (UX) and fostering efficient, intuitive robotic systems. This could promote safer and more productive HRC amid technological shifts and help decipher physiological signals in different scenarios.


STUDY OVERVIEW
Over the past decade, collaborative robots, also known as cobots [1], have gained extensive acceptance in industrial settings. HRC involves situations where humans and cobots work together to perform tasks [2]; for this purpose, cobots are designed for direct interaction with humans in shared spaces or in close proximity. Unlike industrial robots, which are typically isolated from human contact, cobots prioritize safety through features like lightweight materials, rounded edges, speed and force limitations, and the use of sensors and software to guarantee safe operation. Cobots are generally considered to offer a higher level of intuitiveness and ease of programming, and the best synergy with humans is achieved when they jointly address complex issues in a dynamic, observable, and predictable work environment. For task execution, cobots need to be programmed by the user: procedures and movements can be defined in different ways, such as using a teach pendant (i.e., a programmable interface), free drive mode, and computer vision based on deep learning algorithms [3]. The programming phase is a central part of HRC and, for this reason, should be as intuitive as possible. However, initial programming can be difficult for those unfamiliar with these technologies and could lead to stressful situations that compromise good collaboration. It is therefore important to monitor the user's stress level, cognitive load, and effort during this task, so that the programming phase can be made as simple and intuitive as possible [4]. Evaluating and predicting these states can help adapt the platform or the production process in real time, making it more sustainable and better aligned with the user's well-being. To date, different types of wearable sensors can be used to monitor working well-being, stress level, and cognitive workload in real time [5], and to extract metrics that can be interpreted in terms of physical and mental load [6].
Previous studies have focused on generating datasets of physiological signals for the assessment of stress in occupational settings. For example, the WESAD dataset [7] includes physiological and motion signals (e.g., electrodermal activity and blood volume pulse) collected from volunteers in office working contexts, and [8] emphasizes the importance of cobot adaptation to the psychophysical state of the user, monitored in real time via electroencephalography. Because of the complexity of the human psychophysiological signals needed for reliable evaluation of stress level and cognitive load, multiple biological signals must be collected. To the best of our knowledge, no such dataset exists for HRC contexts, and this work proposes the SenseCobot dataset to fill this gap. SenseCobot is a multimodal dataset of physiological signals and subjective evaluations collected from participants involved in HRC tasks. The data were collected in real time with wearable, minimally invasive sensors and questionnaires that are also applicable in real contexts. The purpose of this work is to offer high-quality data obtained in multiple modalities for understanding the psychophysiological states related to mental effort and stress in cobot programming.

METHODS
The experiment was structured into three phases, described in detail below: introduction to the learning materials, baseline measurement task, and hands-on practice. The learning platform was organized to provide information gradually, as needed for each task phase, to make cobot programming as easy as possible for novices. It includes expository slides, video, and audio. The hypothesis was that if unskilled workers are better informed about collaborative robots and receive systematic training, they can learn to program these devices effectively and independently.
2.1 Experiment Phases and Tasks. The experiment was organized so that the user follows the cobot programming instructions shown on a monitor, first in a trial phase and subsequently in an actual simulation. To make the procedure standardized and reproducible, the experimental setup was structured into the following three macro-phases:
Introduction to the Learning Materials - A set of slides with instructions on the cobot programming procedure was provided. Participants had to execute the first four tasks, which are presented again in the Hands-on practice phase, with no time constraints. They could revisit the instructions as needed, no errors were formally monitored, and no subjective questionnaires were administered.
Baseline Measurement Task - Participants were settled in and fitted with the sensors, then shown a neutral screen for 3 minutes, followed by eight videos. The videos, taken from the platforms OpenLav [9] and Moody Digital [10], were chosen to evoke emotions mirroring those that the operator might experience during task execution. The baseline serves to objectify and standardize the psychophysical conditions of each participant and to establish a reference against which the signals acquired during the other tasks can be compared.
Hands-on Practice - Participants were required to complete 5 tasks, the first 4 of which had already been encountered in the learning phase, organized in order of increasing complexity to mimic the difficulties that might be encountered in a real working context:
• Task 1. Participants were asked to move the cobot arm between two designated points (A and B).
• Task 2. Participants were asked to maneuver the cobot arm among three designated points (A, B, and C).
• Task 3. Participants were asked to maneuver the cobot arm to an arbitrary height above point A (A'), and then gradually move it towards point A while keeping it perpendicular to the work surface. After a short pause, the cobot should ascend again to point A'.
• Task 4. The objective was to pick up a screw located at point A on the work surface and perform a placement action. Using the cobot's gripper, participants had to grasp the screw, lift it, and deposit it at point B.
• Task 5. The last task encompasses all the previously acquired commands. Participants were asked to manipulate a box, initially oriented at a 45° angle to the surface, and place it precisely on a predetermined spot of the plane. Following this, they had to raise the robotic arm slightly, close the gripper, and create a continuous boundary at a consistent pace.
In each task the user was free to use the teach pendant or free drive mode; this choice was recorded, as it could be useful for evaluating whether the perceived stress varies with the programming modality. To collect a subjective perception of the stress level, participants answered two NASA-TLX [11] questions at the end of each task, one about physical demand and the other about general effort; at the end of the Hands-on practice phase, they filled in an extended version of the NASA-TLX.
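The baseline measurement described above is meant as a per-participant reference for the task recordings. One simple way to use it is to express each task sample in units of the baseline's variability. The sketch below is only an illustration under that assumption: the article does not prescribe a normalization method, and the function name `baseline_zscore` and the sample values are hypothetical.

```python
from statistics import mean, stdev


def baseline_zscore(task_values, baseline_values):
    """Express task samples as z-scores relative to a participant's baseline.

    Assumption (not from the article): the baseline recording is summarized
    by its mean and sample standard deviation.
    """
    mu = mean(baseline_values)
    sigma = stdev(baseline_values)  # sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("baseline has zero variance; z-score is undefined")
    return [(value - mu) / sigma for value in task_values]


# Hypothetical EDA-like samples in microsiemens, for illustration only.
baseline = [1.0, 2.0, 3.0]
task = [2.0, 4.0]
normalized = baseline_zscore(task, baseline)
```

Values near zero then indicate a signal close to the participant's own resting level, which makes recordings from different participants easier to compare.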

2.2 Equipment and Sensors Used. The workstation was arranged as shown in Fig. 1 and as described below: the UR10e cobot, with six degrees of freedom, was positioned in front of the participant, while a touch screen monitor showing the explanation slides was placed on the participant's left side. Various sensors were employed to capture participants' physiological states, enabling a comprehensive analysis of their interactions with the cobot. Specifically:

Participants.
A total of 21 individuals took part in the experiments: 17 male and 4 female. All participants were university students who volunteered for the study through an online scheduling tool. The main inclusion criteria were: no prior experience in cobot programming, no significant medical history, age over 18 years, and an intermediate level of English (B1). Participants were instructed to avoid substances like coffee, nicotine, and alcohol on the day of the experiment, as these could affect brain activity and stress-related physiological signals. Each participant filled in a consent form.

DATASET
In this project, a dataset named SenseCobot, comprising the physiological signals collected from the participants, has been developed. Each file in the dataset is in .csv format and, to facilitate its use, the data have been organized into the following main folders, based on the type of collected signal:
• Additional_Information
• IBI_Empatica_Signals
• Video_Baseline
• Video_Tasks
Within each folder, the .csv files are named following this format: Signal_Type_Task_N_P_M, where 'N' represents the baseline or the task number (from 1 to 5) and 'M' the participant number (from 1 to 21). This approach facilitates the selection of signals collected during the same task across all participants, or signals collected from a specific participant across all tasks. The Additional_Information folder contains all the information necessary to interpret and correctly use the dataset. The INFO.txt file explains how to interpret the data in the other files of the folder, specifically:

• Movement_Type.csv summarizes the participant's choices regarding the cobot manipulation modality (i.e., jogging, freedriving, mixed). This could be useful in evaluating motion artifacts in EDA and GSR signals;
• NASA_TLX.csv reports the ratings derived from questionnaire responses;
• Participants_Information.csv comprises basic participant details, including age and gender;
• Tasks_Duration.csv contains task durations in minutes.
In the same way, each signal folder contains an Info_.txt file that describes the contents of the .csv files. For a clearer comprehension, they are also reported below. The files within the ECG_Shimmer3_Signals folder report the:
• Timestamp, i.e., the recording period of the experiment;
• ECG CAL, i.e., the calibrated electrocardiography values measured between electrodes, in millivolts (mV).
The EDA_Empatica files report the:

• Timestamp;
• BVP (Blood Volume Pulse) obtained from a photoplethysmography sensor;
• EDA, expressed in microsiemens (µS), which conveys the skin's electrical activity, fluctuating based on the subject's stress level;
• HR obtained from BVP;
• TEMP, i.e., body temperature in °C;
• ACC_X, ACC_Y, and ACC_Z, obtained from the accelerometer along the X, Y, and Z axes.
The EEG_Enobio20 files contain the following information:
• Timestamp;
• Data collected from the 20 channels, expressed in microvolts (µV).
The Emotions files report the following information:
• Timestamp;
• Raw data of the recorded emotions;
• Measurement Proportion value, that is, the ratio of the number of successful registrations during task execution to the number of successful registrations during the baseline, used as reference. This value was defined because emotion registration relies on recording the subject's face, which can be interrupted by occlusions and signal losses during dynamic activities. With the baseline Measurement Proportion set to 100, a value above 100 indicates that during a specific task the face was recorded for more measurements than during the baseline, and a value below 100 implies the opposite.
The GSR_Shimmer3 files report:
• Timestamp;
• GSR Resistance CAL, i.e., resistance data from the Shimmer 3 GSR sensor, expressed in kilohms (kΩ);
• GSR Conductance CAL, i.e., conductance data from the Shimmer 3 GSR sensor, expressed in microsiemens (µS).
The HR_Bangle files contain information regarding the:
• Timestamp;
• HR derived from the photoplethysmography sensor;
• Confidence, a parameter indicating the quality of the measurement, expressed on a scale from 0 to 100, with 0 being the least accurate and 100 the most accurate.
The IBI_Empatica files include information on:
• Timestamp;
• IBI (Inter-Beat Interval), i.e., the distance between successive peaks of the blood volume pulse, expressed in seconds, obtained from the Empatica E4 device.
A 'Label' column has been appended to all the files except those of the baseline. This column contains the results of the NASA-TLX subjective questionnaires, reporting "STRESS" if the sum of the two response scores is equal to or greater than 7, and "NO-STRESS" if it is less than 7. Moreover, in the Baseline files an additional column named 'SourceStimuliName' has been added. This column reports the name of the video (see Section 2) shown at the corresponding timestamp. For completeness, two additional folders have been added:

• Video_Baseline folder contains the videos used for the baseline, downloaded from the OpenLav and MoodyDigital platforms;
• Video_Tasks folder contains a video showing task execution and the experimental set-up to facilitate comprehension.
SenseCobot could help other researchers and practitioners in robotics to enhance their understanding of HRC, assess cognitive workload, and optimize collaborative robotic systems during programming tasks. Ultimately, this dataset could drive progress in creating more user-friendly programming interfaces, developing new predictive machine-learning models to monitor stress levels in real time, and promoting efficient human-robot collaboration across various applications.
All data in the SenseCobot dataset have been encrypted with SHA-256 cryptography. SenseCobot has been uploaded to the Zenodo repository (DOI: https://doi.org/10.5281/zenodo.8363762); the language used is English. The dataset is published under an open-source policy, according to the "Creative Commons Attribution 4.0 International" license. If using the SenseCobot dataset partially or completely, you are asked to cite this article and the authors.
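The naming convention and the 'Label' rule described in this section can be sketched in a few lines of Python. This is a hedged illustration, not the authors' tooling. It assumes that task files are spelled `<Signal_Type>_Task_<N>_P_<M>.csv`; the example file names are hypothetical, and baseline files are not handled because their exact spelling is not given here. The helper names `parse_task_filename`, `select_files`, and `stress_label` are introduced only for this example.

```python
import re
from typing import NamedTuple, Optional

# Assumed spelling of task files: <Signal_Type>_Task_<N>_P_<M>.csv
# (N = task 1..5, M = participant 1..21). Baseline files are not covered.
_TASK_FILE = re.compile(
    r"^(?P<signal>.+)_Task_(?P<task>[1-5])_P_(?P<participant>\d{1,2})\.csv$"
)


class Recording(NamedTuple):
    signal: str
    task: int
    participant: int


def parse_task_filename(name: str) -> Optional[Recording]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    match = _TASK_FILE.match(name)
    if match is None:
        return None
    participant = int(match.group("participant"))
    if not 1 <= participant <= 21:
        return None
    return Recording(match.group("signal"), int(match.group("task")), participant)


def select_files(names, task=None, participant=None):
    """Keep the file names that match the requested task and/or participant."""
    selected = []
    for name in names:
        rec = parse_task_filename(name)
        if rec is None:
            continue
        if task is not None and rec.task != task:
            continue
        if participant is not None and rec.participant != participant:
            continue
        selected.append(name)
    return selected


def stress_label(score_a: float, score_b: float) -> str:
    """Label rule from the text: the sum of the two response scores, threshold 7."""
    return "STRESS" if score_a + score_b >= 7 else "NO-STRESS"


# Hypothetical file names, for illustration only.
files = [
    "ECG_Shimmer3_Task_1_P_1.csv",
    "ECG_Shimmer3_Task_1_P_2.csv",
    "ECG_Shimmer3_Task_2_P_1.csv",
]
task_1_files = select_files(files, task=1)
participant_1_files = select_files(files, participant=1)
```

Filtering on the parsed fields rather than on substrings avoids accidental matches, for example participant 1 matching participant 11.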

USAGE NOTES
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

1. Shimmer 3 ECG with five electrodes was used to measure the electrocardiogram;
2. Enobio 20 EEG was used to measure the electroencephalogram. In particular, 17 EEG channels, 2 EOG ocular electrodes, and a reference EXT channel were used, according to the 10-20 standard [12];
3. Shimmer 3 GSR was used to measure Galvanic Skin Response (GSR). It was equipped with two electrodes positioned on the index and middle fingers of the non-dominant hand to reduce motion artifacts and ensure greater freedom of movement;
4. Empatica E4 wristband sensor measures several signals, including the ElectroDermal Activity (EDA) of the skin. It was placed on the wrist of the subject's non-dominant hand, with an inverted orientation to align the optical sensors with the area containing the highest concentration of blood vessels;
5. Bangle JS 2, a smartwatch capable of recording the Heart Rate (HR) via a photoplethysmography sensor, was placed on the wrist of the dominant arm;
6. AFFDEX 2.0 module of the iMotions software was used to monitor facial expressions in real time. This module comprises a convolutional and recurrent neural network that has been trained to capture the 3D position of the user's head and recognize 7 fundamental emotions: joy, anger, fear, disgust, contempt, sadness, and surprise. It also records additional features such as neutral expressions, valence, attention, and confusion. The iMotions platform was used not only for emotion recognition but also for data collection and synchronization of the data from Shimmer 3 ECG, Shimmer 3 GSR, and Enobio 20.
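Two of the quantities recorded by these sensors are linked by standard unit relations, which can serve as a quick consistency check on the files. Conductance is the reciprocal of resistance, so 1 kΩ corresponds to 1000 µS. An inter-beat interval in seconds corresponds to an instantaneous heart rate of 60 divided by that interval, in beats per minute. The function names below are hypothetical, and the article does not state that the authors derived any column this way.

```python
def resistance_kohm_to_conductance_us(resistance_kohm: float) -> float:
    """Convert skin resistance in kilohms to conductance in microsiemens.

    1 / (1 kOhm) = 1 mS = 1000 uS, hence the factor of 1000.
    """
    if resistance_kohm <= 0:
        raise ValueError("resistance must be positive")
    return 1000.0 / resistance_kohm


def ibi_to_bpm(ibi_seconds: float) -> float:
    """Convert an inter-beat interval in seconds to beats per minute."""
    if ibi_seconds <= 0:
        raise ValueError("inter-beat interval must be positive")
    return 60.0 / ibi_seconds


# Illustrative values only, not taken from the dataset.
conductance = resistance_kohm_to_conductance_us(500.0)
heart_rate = ibi_to_bpm(0.75)
```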