PriMA-Care: Privacy-Preserving Multi-modal Dataset for Human Activity Recognition in Care Robots

In the field of robotics, caregiving robots and personal assistants are assuming an increasingly prominent role, directly impacting human lives. Especially in healthcare domains, these systems are starting to provide continuous 24/7 care by monitoring patients and delivering real-time insights into their activities. The effective deployment of future robots relies on equipping them with sophisticated Human Activity Recognition (HAR) algorithms. Many HAR algorithms are based on Artificial Intelligence (AI) and Machine Learning (ML) models. The development of these models necessitates suitable datasets. This paper introduces a Privacy-preserving Multimodal dataset for HAR in the context of Human-Robot Interaction (HRI) for Care robots (PriMA-Care). Tailored for care robots, PriMA-Care includes 27 diverse user activities, ranging from daily tasks to physical HRI, with data from 10 privacy-preserving sensors and 17 participants. PriMA-Care addresses critical gaps in existing datasets, offering a suitable resource for HAR research in care robots.


INTRODUCTION
Today, the use of robots is expanding across various domains. Many of these robots enter our lives indirectly, such as industrial [22] and agricultural [25] robots, or have only a sporadic and specific presence in our daily routines, as is the case with surgical [7] and rehabilitation [23] robots. In contrast, caregiving robots and personal assistants have garnered significant attention for their direct presence in human lives [17]. These robots can serve as personal aides at home, assisting with daily tasks. Furthermore, in healthcare settings, these robots play a pivotal role by providing continuous care for patients and monitoring them around the clock, offering real-time insights into their activities and situations. This real-time monitoring provides healthcare professionals with comprehensive reports on the user's health status and activities [12].
To enhance user assistance, the robot requires HAR algorithms. HAR enables the robot to discern user activities, aiding the timely provision of assistance or the notification of caregivers in potential emergencies. These algorithms are designed to identify and interpret user activities using the data collected by the robot's sensors [11].
AI, particularly deep learning, has gained prominence in HAR [8]. The development of AI algorithms relies heavily on the availability of diverse and representative datasets [3]. In the realm of caregiving robots, crafting datasets that capture human activities in various contexts becomes imperative. Creating a dataset for HAR in care robots necessitates consideration of several critical factors: (1) In continuous 24/7 monitoring, safeguarding user privacy is crucial. Privacy involves keeping personal aspects confidential, covering dimensions like personal life, information, and relationships [1]. Here we specifically address information privacy, where individuals seek control over their data [4], [24]. This motivates integrating privacy-preserving sensors into the robot and emphasizing data collection through them.
(2) Leveraging multimodal sensing and sensor fusion approaches can significantly enhance HAR in user monitoring [19]. Caregiving robots, equipped with diverse sensors, stand as a suitable option for implementing such HAR algorithms. (3) Users engage in a variety of activities at home and in the presence of the robot, spanning routine daily tasks to physical HRI. For effective continuous HAR using care robots, it is advantageous to develop HAR algorithms on datasets that encompass a wide range of user activities.
In this paper, we introduce the PriMA-Care dataset, a comprehensive multimodal dataset for privacy-preserving HAR within the context of HRI. PriMA-Care stands out with its specific emphasis on care robots and personal robot assistants, containing a collection of 27 diverse user activities. These activities span four distinct categories and encapsulate scenarios where users interact with a robot or undergo monitoring by a robot. PriMA-Care is uniquely tailored for indoor environments, aligning with real-world applications in homes or care facilities. The dataset's distinctive features encompass data collection from an array of privacy-preserving sensors. The inclusion of synchronized data from various sensors in PriMA-Care facilitates the development of sophisticated multimodal and sensor fusion models for HAR. This capability may empower care robots with robust HAR algorithms, enhancing their proficiency in understanding and responding to diverse user activities in real-world scenarios.

BACKGROUND
HAR holds a pivotal role across diverse applications, serving as a critical capability for interpreting and responding to human behaviors. In the realm of HRI, HAR becomes particularly crucial, enhancing the overall effectiveness of interactions between robots and users [27]. This significance is amplified in the context of care robotics, where HAR influences the functionalities of robots designed for direct interaction with individuals in homes or care facilities. This recognition capability significantly contributes to the efficiency of these systems, ranging from assisting with daily tasks to supporting caregivers in 24/7 patient monitoring [21].
While advancements have been made in HAR within HRI, a noticeable gap exists, specifically concerning HAR in the context of care robotics. Existing studies in this field often concentrate on HAR in industrial [10], [6], agricultural [2], and social [13] robotics. The limited research available on HAR in care robots heavily relies on specific sensors such as RGB cameras, providing a restricted perspective on activity recognition [26].
In general, many datasets for HAR, particularly in HRI, predominantly utilize a narrow set of sensors. Most datasets are based on vision cameras like RGB and depth [15], solely wearable sensors [14], or a combination of these [9]. Notably, the over-reliance on RGB cameras poses potential risks to user privacy [5], a significant limitation that pervades many datasets. User privacy assumes heightened significance in the context of HAR for care robots, given their integration into people's daily lives [16]. Prioritizing user privacy establishes trust between the robot and the user, ultimately enhancing the quality of these robots' usage.
Furthermore, modern robots can leverage a variety of sensors, opening avenues for the application of advanced methods such as multimodal sensing and sensor fusion in HAR. These methods, combining different sensors, have demonstrated success in improving the accuracy and effectiveness of HAR systems [19].
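For illustration, the following is a minimal sketch of one common fusion strategy, late (decision-level) fusion, in which per-sensor classifiers are trained independently and their class probabilities are combined. This is an illustrative example under assumed inputs, not a method prescribed by the works cited above.

```python
# Minimal late-fusion sketch: combine (optionally weighted) class-probability
# vectors produced by independent per-sensor HAR classifiers.
import numpy as np

def late_fusion(prob_per_sensor, weights=None):
    """Combine per-sensor class-probability vectors into one prediction.

    prob_per_sensor: list of arrays, each of shape (n_classes,).
    weights: optional per-sensor reliability weights.
    """
    probs = np.stack(prob_per_sensor)              # (n_sensors, n_classes)
    if weights is not None:
        probs = probs * np.asarray(weights)[:, None]
    fused = probs.sum(axis=0) / probs.sum()        # renormalize to a distribution
    return int(fused.argmax()), fused

# Hypothetical outputs from a depth-based and a force/torque-based model.
depth_probs = np.array([0.6, 0.3, 0.1])
ft_probs = np.array([0.2, 0.7, 0.1])
label, fused = late_fusion([depth_probs, ft_probs], weights=[0.5, 0.5])
```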
To the best of our knowledge, there is currently no dataset in the existing literature specifically tailored for care robots, comprehensively addressing various aspects of their deployment in home or care settings. Additionally, as previously mentioned, despite the demonstrated effectiveness of multimodal sensing approaches for HAR, there is a scarcity of datasets that incorporate diverse sensors. This scarcity limits our ability to explore the combination of different sensors and study various multimodal and sensor fusion approaches for HAR. Furthermore, the importance of collecting datasets with a specific emphasis on privacy-preserving sensors has not received adequate attention in prior research.
To address the aforementioned limitations related to the absence of a comprehensive dataset dedicated to HRI in care robots, we have developed the PriMA-Care dataset. By enabling a multimodal sensing approach and incorporating a variety of sensors, with a particular focus on privacy-preserving sensors, we aim to fill this gap. The PriMA-Care dataset comprises four categories of user activities: normal daily life activities, physical HRI, users commanding the robot, as well as a spectrum of user movements in the proximity of the robot. In total, the dataset includes 27 distinct activities.
The dataset is structured in a robot-centric format, integrating data from 10 different types of sensors (comprising 21 sensor outputs) in a synchronized manner. This synchronization facilitates the seamless and accurate integration of sensor data. The upcoming section provides detailed insights into the covered activities, the robot equipped with all the utilized sensors along with their hardware specifications, the experimental data collection setup, the user recruitment process, and a comprehensive overview of the technical specifications of the dataset.

THE PriMA-Care DATASET

Applied robot and sensor specifications
The PriMA-Care dataset was collected with the primary objective of facilitating the development of HAR algorithms in the context of HRI for care robots. Consequently, the entire dataset collection process was conducted within a robot-based system. This implies that all sensors employed for data collection were seamlessly integrated into a robotic framework, and the data collection procedure was carried out through the robot's software.
In the PriMA-Care dataset, the TIAGo robot [18], purpose-built for deployment as a care and assistive robot in domestic and care settings, served as the designated platform. All sensors utilized in data collection, except the wearable sensor, had their data acquired through this robotic system. The wearable sensor, characterized by its non-real-time structure and proprietary software, presented constraints that precluded the collection of its data using the robot's software. Table 1 provides a comprehensive overview of all sensors incorporated into the dataset, accompanied by their respective specifications. These sensors encompassed diverse technologies. RGB and Depth Cameras: An RGB-D sensor was employed, providing separate outputs for each camera in different formats. Note that the RGB sensor has only been used for labeling the dataset, and its data will not be included in the dataset. Force/torque Sensor: The robot's wrist was equipped with a force/torque sensor that measured the force applied to the robot in three directions separately.
Encoder: The robot's arm, consisting of 7 joints each equipped with an encoder, contributed to the dataset during physical HRI activities. Figure 1 illustrates the TIAGo robot utilized in this research, along with the external sensors incorporated into the dataset.
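As an illustration of how these robot-centric streams can be consumed at runtime, the following is a minimal sketch of a ROS node subscribing to the wrist force/torque and arm encoder data. The topic name /wrist_ft and the "arm_" joint-name prefix are assumptions and should be verified against the actual robot configuration (e.g., with rostopic list).

```python
#!/usr/bin/env python
# Sketch: log wrist force/torque and arm encoder readings during physical HRI.
import rospy
from geometry_msgs.msg import WrenchStamped
from sensor_msgs.msg import JointState

def wrench_cb(msg):
    # Force applied to the robot's wrist in three directions (x, y, z).
    f = msg.wrench.force
    rospy.loginfo("force [N]: %.2f %.2f %.2f", f.x, f.y, f.z)

def joints_cb(msg):
    # Encoder positions for the 7 arm joints (names assumed to start with "arm_").
    arm = {n: p for n, p in zip(msg.name, msg.position) if n.startswith("arm_")}
    rospy.logdebug("arm joint positions [rad]: %s", arm)

if __name__ == "__main__":
    rospy.init_node("ft_encoder_logger")
    rospy.Subscriber("/wrist_ft", WrenchStamped, wrench_cb)   # assumed topic name
    rospy.Subscriber("/joint_states", JointState, joints_cb)  # standard ROS topic
    rospy.spin()
```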

Activities
The primary aim was to encompass in the PriMA-Care dataset the various types of activities that users may engage in when interacting with a robot or being monitored by a care robot. By expanding the spectrum of activities within the dataset, the development of more advanced HAR algorithms becomes feasible, enhancing the robot's capability for a broader recognition of user activities. To achieve this goal, four types of activities have been integrated into the dataset, classified into the following groups: 1. Daily life routine activities, 2. Physical HRI, 3. User commanding the robot with specific hand gestures, and 4. Detecting different types of user movements in the robot's vicinity. As evident and previously mentioned, these activities encompass scenarios specifically focused on the interaction between the robot and the user (physical HRI and commanding the robot) or instances where only the user is being monitored by the robot. In total, 27 different activities have been included in the dataset. Table 2 provides a detailed overview of all activities and the respective categories to which they belong. Figure 2 displays examples of activities, with their corresponding ID numbers from Table 2, performed by a subject and captured by various sensors. Most activities are self-explanatory; however, clarification is provided for some ambiguous ones: Activity 4: The user treats the robot as a companion, holding its hand and attempting to move it. Activity 5: The user simulates an attempt to turn off the robot. Activity 6: The user deliberately interacts with the robot in a risky manner, maintaining close proximity. For user-commanding activities, distinct hand gestures and movements, such as rotations, wrist motions, and arm openings, represent specific commands. Activity 18 involves users mimicking walking difficulties. Activities 20 and 21 depict users employing a walker and a wheelchair, simulating scenarios relevant to patients with walking challenges in care facilities.

Participant Demographics and Recruitment
In total, 17 participants, including 9 females and 8 males, took part in the dataset collection. The participants, ranging in age from 22 to 74 years, exhibited an average age of 30.6 years. Recruitment efforts were conducted through a variety of channels, including advertisements on Facebook, within university buildings, and through individual outreach. All recruited participants were considered fully able users without any cognitive or physical disabilities.
The data collection was carried out in a lab at the Department of Informatics at the University of Oslo, Norway. The study has been registered with and evaluated according to the ethical guidelines of the Norwegian Center for Research Data (NSD) (Ref. No.: 863469). Before commencing the experiments, participants received thorough information about the data collection process, covering details of the activities, the sensors and their technical specifications, and the procedures for data storage and sharing. Subsequently, participants provided their consent by signing an informed consent form, wherein they were asked to consent to three main topics: 1. allowing the collection of their data by the specified sensors, 2. selecting the activities they wished to perform, and 3. indicating their preference for making their data publicly available (no data, a portion of the data, or all data). Participants retained the right to withdraw from the data collection at any point without the need for justification or facing negative consequences.
Upon completing the designated activities, each participant received a gift card of 150 NOK as a token of appreciation. The overall duration of the entire data collection process ranged from 30 to 60 minutes per person. The data was collected on a dedicated computer and stored on the Service for Sensitive Data (TSD) (Ref. No.: p1582), owned by the University of Oslo.

Technical specifications and details
PriMA-Care was collected within a robot-based system using the Robot Operating System (ROS) [20]. ROS's synchronization capabilities for diverse sensors made the data synchronization process seamless, and the rosbag package in ROS facilitated simultaneous data collection across sensors.
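As a concrete illustration, the snippet below shows one standard ROS mechanism for such synchronization, message_filters.ApproximateTimeSynchronizer, which groups messages whose timestamps fall within a small tolerance. The exact mechanism and topic names used for PriMA-Care are not specified here; both should be treated as assumptions.

```python
# Sketch: time-based grouping of heterogeneous sensor messages in ROS.
import rospy
import message_filters
from sensor_msgs.msg import Image, PointCloud2
from geometry_msgs.msg import WrenchStamped

def synced_cb(depth, cloud, wrench):
    # All three messages carry timestamps within the allowed slop of each other.
    rospy.loginfo("synchronized frame at t=%.3f", depth.header.stamp.to_sec())

rospy.init_node("sensor_sync_sketch")
depth_sub = message_filters.Subscriber("/xtion/depth/image_raw", Image)  # assumed
cloud_sub = message_filters.Subscriber("/lidar/points", PointCloud2)     # assumed
ft_sub = message_filters.Subscriber("/wrist_ft", WrenchStamped)          # assumed

# Tolerate small clock offsets between sensors (here 50 ms).
sync = message_filters.ApproximateTimeSynchronizer(
    [depth_sub, cloud_sub, ft_sub], queue_size=10, slop=0.05)
sync.registerCallback(synced_cb)
rospy.spin()
```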
PriMA-Care was collected using a computer with Ubuntu 18.04. Each sensor's data was published as a rostopic, and a launch file was executed to simultaneously send all sensor data via LAN cable to a computer, saving it in rosbag format. Raw data across all activities and users totaled 2.1 TB, compressing to around 600 GB. Recording durations per user varied from 20 to 30 minutes, of which pure activity time is expected to be around 15 to 25 minutes. The dataset is currently saved in rosbag format. The wearable sensor data is in JSON format and is synchronized with the other, ROS-based sensor data.
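For users working with the released files, offline access might look like the following sketch: iterate over a recorded bag and pair each message with the nearest wearable sample by timestamp. The file names and the JSON field names ("t", "acc", "gyro") are hypothetical; the actual schema may differ.

```python
# Sketch: read a rosbag and align wearable JSON samples by nearest timestamp.
import bisect
import json
import rosbag

with open("wearable_s01.json") as f:        # hypothetical file name
    samples = json.load(f)                  # assumed: list sorted by time
wear_ts = [s["t"] for s in samples]         # assumed: epoch seconds per sample

with rosbag.Bag("subject01.bag") as bag:    # hypothetical file name
    for topic, msg, t in bag.read_messages(topics=["/wrist_ft"]):
        # Index of the first wearable sample at or after this message's time.
        i = min(bisect.bisect_left(wear_ts, t.to_sec()), len(samples) - 1)
        nearest = samples[i]
        # ... pair msg with nearest["acc"] / nearest["gyro"] for fusion ...
```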
Open accessibility is a core goal for PriMA-Care. Of the 17 participants, 15 consented to making all or part of their data publicly available. Upon approvals and platform preparations, the dataset will be shared publicly and is expected to be around 300 GB. To safeguard participant data, a request-based access system will be implemented, ensuring responsible use. People seeking access will submit requests, preferably indicating their purpose, and the links to the dataset will be shared with them via email. The dataset is anticipated to be publicly available from mid-2024; access details will be outlined on the Robotics and Intelligent Systems (ROBIN) group's webpage. Presently, a video recording of different sensors from one of the participants is available through this link.

CONCLUSION AND FUTURE WORK
This paper introduces the PriMA-Care dataset, designed specifically for HAR in the realm of HRI within care robots. With an array of privacy-preserving sensors, multimodal sensing capabilities, and a comprehensive spectrum of categorized user activities in indoor settings, the dataset stands out. Its synchronized sensor data makes it particularly apt for exploring various AI- and ML-based multimodal sensing and sensor fusion methods for HAR.
While initially collected for care robots, the dataset's versatility extends beyond, encompassing different sensor types commonly used in various robotic fields or ambient sensing applications. The paper explains the dataset's sharing process, slated for public availability in mid-2024, catering to researchers in robotics, HAR, HRI, and multimodal sensing and sensor fusion fields.
Ongoing efforts aim to enhance the dataset by including more data on common activities within the context of HRI for care robots. A future iteration will prioritize predicting future user activities in the HRI context, moving beyond mere activity recognition.

Figure 1: The employed robot during data collection and the various sensors utilized in the process. a: TIAGo robot and its manipulator's internal encoders, b: RGB-D camera, c: Thermal camera, d: 3D Lidar, e: Torque/force sensor, f: Ultrawideband, g: 2D Lidar, h: Wearable sensor.

Figure 2: Selected activity samples with corresponding ID numbers from Table 2, performed by a subject and captured by various sensors. Due to space constraints, only a subset of activities and sensors is included.

Table 1: Specifications of Utilized Sensors

Thermal Camera: This sensor delivered color and monochrome outputs, both of which were recorded in the dataset.
3D Lidar Sensor: The utilized 3D Lidar sensor could collect data in 16, 32, 64, and 128 channels. The PriMA-Care dataset specifically captured the 128-channel output, from which the 16-, 32-, and 64-channel outputs can also be extracted. Wearable Sensor: Worn on the user's right wrist, this sensor collected accelerometer and gyroscope data from the user's right hand.
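As an illustration of how the lower channel counts might be derived from the recorded 128-channel output, the sketch below keeps every fourth ring, yielding a 32-channel scan. It assumes the point cloud carries a per-point "ring" field, as is common for multi-beam lidars; verify this against the actual data.

```python
# Sketch: derive a 32-channel scan from a 128-channel PointCloud2 message
# by keeping every 4th ring (assumes a per-point "ring" field exists).
import sensor_msgs.point_cloud2 as pc2

def subsample_rings(cloud_msg, keep_every=4):
    """Return (x, y, z) points whose ring index is a multiple of keep_every."""
    pts = pc2.read_points(cloud_msg, field_names=("x", "y", "z", "ring"),
                          skip_nans=True)
    return [(x, y, z) for x, y, z, ring in pts if ring % keep_every == 0]
```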