Towards a Multimodal Synchronized System for Quantifying Psychophysiological States in Canine Assisted Interactions

Canine Assisted Interactions (CAIs) are widely used to provide therapeutic benefits to human participants in various contexts (e.g., cancer-related fatigue, post-traumatic stress disorder treatment, and child reading facilitation). Despite their widespread adoption, questions remain about the outcomes for the humans and animals involved in these interactions. Previous attempts to address these questions have suffered from core methodological weaknesses, including insufficiently objective approaches and a lack of focus on the canine perspective. Using a novel integrated system composed of custom-designed and commercially available wearable devices, we present a first-of-its-kind study that collects simultaneous and continuous physiological data from both CAI interactants. Our repeated-measures pilot study also combined these data with a novel dyadic behavioral coding system and short- and long-term surveys. We evaluate these multimodal data streams independently and further correlate the psychological, physiological, and behavioral metrics to better elucidate the outcomes and dynamics of CAIs. Altogether, this work takes a significant step toward a better understanding of how CAIs improve well-being and how interspecies psychophysiological states can be appropriately measured.


INTRODUCTION
Canine Assisted Interactions (CAIs) are a class of widely adopted complementary and alternative medicines that utilize interactions with trained dogs. Like all Animal Assisted Interactions (AAIs), they aim to improve quality of life or to affect specific clinical endpoints (e.g., blood pressure or cortisol) for human participants. While some studies show no or neutral effects, CAIs have also been shown to have certain benefits for humans [38]. Many of these positive effects are attributed to bonding between the interactants or to second-order effects of the interactions (e.g., exercise and external focus), among other things [38]. In trying to better understand the nature and source of the observed benefits, CAI researchers have recently been moving towards objective and quantitative evaluative methods and away from more qualitative, subjective approaches. However, both the tools for and the targets of this quantification are lacking.
While dogs interacting with humans is the most common form of pet therapy, many studies focus only on quantifying the human element, generally neglecting the dog's perspective and limiting the depth of interspecies interaction investigation possible [61], [56], [24], [45]. This not only has ethical implications (in the event that the selected therapy negatively impacts the dog) but also affects the quality of the human's therapy, which is highly dependent on the well-being of the therapy animal [38], [48]. Similarly, CAI studies tend to focus on general assessments of quality of life, but the high variability in the measures used and the outcomes observed in these assessments can partially be attributed to the vagueness of the typical quality of life concept. To address these first two concerns, we propose switching to a dyadic psychophysiological perspective. Psychophysiology (PP) generally refers to the idea that mental and emotional processes have detectable physiological correlates, and it provides a more solid theoretical framework for objective interpretation of quantitative CAI data [19], [26], [95], [53]. Additionally, by focusing on both members of the interacting dyad, this perspective allows for direct or comparative measurements that the general quality of life approach does not, such as those from human or animal subjects with limited communication or non-existent survey response capabilities [90].
When considering quantitative data collection, some CAI researchers incorporate biochemical assays (e.g., measuring oxytocin, vasopressin, or cortisol) and electronic monitoring devices (e.g., measuring heart rate or blood pressure) into their studies, representing a major step in the right direction. However, because CAIs can range in duration from 10 minutes to 16 hours and in activity from quiet stroking to vigorous physical movement, many of the current tools confine measurement to pre-intervention and post-intervention data collection and further limit these collections to clinical or research settings [38]. Common psychophysiological measurements using the available biochemical and electronic monitoring technologies can be very physically invasive and also tend to significantly impact or obstruct CAI activities. To address these concerns, we deploy a study design that eliminates the need for biochemical analyte collections and utilizes wireless wearable electronic monitoring systems developed by our research group for the continuous, free-range, and non-invasive measurement of human and canine physiology [92], [62], [23], [18], [3]. In this paper, we present a pilot study using these wearable tools for quantitative psychophysiological analysis of interspecies CAI dyads. Though nascent, this work contributes to field efforts aiming to better include and quantify the canine perspective in CAI research and to improve animal-centered emotion recognition technologies for real-world deployment [37], [76], [15]. The tools, methods, and results in this paper may eventually enable researchers to better and more consistently connect CAI inputs to outcomes, to identify relevant psychophysiological states in dyad members, to conduct studies validating CAIs as viable complementary therapies, and to increase the benefits of CAIs for humans and animals alike as they interact in various contexts. The products of this work may also significantly bolster studies in other human-animal interaction scenarios by laying the foundational principles for free-range human and animal data collection beyond the research environment. Altogether, and regarding its contribution to the ACI community, this article presents a unique study deploying multiple wearable systems on two interactants in a typical CAI while synchronously collecting multimodal psychophysiological data and analyzing it in comparison to simultaneous survey and behavior coding ground truths.

Theoretical Frameworks
Our approach is motivated by the core psychophysiological framework, which suggests that mental and emotional states (such as stress, bonding, and flow) have physiological correlates in animals that are context- and stimulus-dependent [71], [3], [18]. Affective states are multifaceted events that recruit bodily systems from the neural to the endocrine and are best approximated by fusing and correlating multimodal data streams from several related sources [71], [3], [20], [66]. As posited by several CAI mechanistic hypotheses, we assert that positive human-animal interactions can lead to dyadic relationships, which can then encourage human-animal bonds via mutually beneficial quality time and positive contact [41], [39]. Finally, we see potential in behavioral and physiological synchrony as a burgeoning metric of bonding between species [70], [74], and in heart rate, heart rate variability, and physical activation as relevant indices of human and canine well-being [70], [34], [89], [20].

Study procedure
Our pilot study included a convenience sample of 8 adolescent/young adult humans (female = 62.5%) and 4 canines (female = 25%; breeds = Shih-Tzu & Maltese mix, Pitbull & Lab mix, Pitbull, and Yorkshire Terrier). The subject inclusion/exclusion criteria were as follows: (1) at least one of the human participants must have owned the participating dog for 6 or more months; (2) the dog must tolerate both collars and harnesses well; (3) the human must be willing and able to wear devices on both wrists and on the chest for roughly an hour [68]; (4) the human subject must be able and willing to complete both written and online surveys; and (5) both members of the dyad must be able and willing to come to a dedicated NC State University research lab space for data collection on at least two different days. Altogether, the recruited human and canine subjects, variously paired, completed a total of 22 experimental day sessions.
The pilot test interactions were unstructured and largely human-seated, non-ambulatory interactions with the dog (e.g., talking to, touching, grooming, toy play, treat giving, and commands) without the researcher in the room. It was left to the subject to determine whether to keep the dog leashed during the interaction, and most, but not all, opted for this setup [82]. As described later, survey instruments were administered before and throughout the experimental sub-sessions. The 10 minute interaction sessions were couched, before and after, in neutral sessions during which the human sat quietly alone and relaxed (e.g., read a book, meditated, or listened to music) and the dog was removed from the research space by the researcher [Figure 1]. These 5-10 minute neutral sessions served both to reset the human's experience and to provide multiple same-day comparative baselines, as features of emotions are relatively non-stationary [45], [72], [69]. While some canine subjects rested during the human neutral sub-sessions, the official baseline for the dog occurred during a separate set of 5-10 minute periods in which they wore the physiological equipment and came to a natural rest state (i.e., relaxed crouching with head down or otherwise lying down fully) in the presence of their human owner and the researcher. In keeping with field best practices, the evaluation methods were mixed, including physiological data collection, human subject surveys, and behavior coding [77], [26], [36], [86], [57], [14], [43]. All test procedures were approved by the NC State University Institutional Review Board and the school's Institutional Animal Care and Use Committee (IACUC).
As previously discussed, we are interested in both the human and canine subjects' responses to the interaction, so both wore physiological data collection equipment in this pilot study. The systems used included (1) the HET chest patch, HET wrist watch, and Empatica E4 for humans, along with (2) the GEB smart collar and GEB harness for dogs [77]. Each device has been introduced by our research group previously or is commercially available for scientific research [92], [62], [23]. These devices were selected for the biological signals acquired, their high sampling rates, their stability during movement, their positive ergonomic profiles, and their relative ease of use. Together, they make up the first synchronized system for interspecies interaction measurement. As part of this human-canine interaction research setup, and for post-hoc behavioral coding, two or more smartphone video cameras were used to capture all angles of the research space during the interaction and neutral sub-sessions.

Analysis design
Epoch Selection.
Interested in the internal dynamics of a CAI session, we first selected a repeatable time period upon which to focus our analysis. We used the information from Cowley et al. 2016, Cacioppo et al.'s Handbook of Psychophysiology, and other relevant reviews to determine the time intervals of interest for each collected signal and to select the most appropriate epoch length to track the desired changes in physical phenomena across signals [86], [1], [46]. These sources show that, for several relevant affective measures, changes could be reasonably measured on 5 second, 30 second, and 1 minute time scales, and several papers in the canine literature used 5 second, 15 second, and 5 minute time slices [44], [35], [97], [86], [64]. As such, we selected 10 second epochs to capture the fastest changes (e.g., arousal via inertial measurement units), though we recognize (a) that significant changes in heart rate were likely to occur somewhat more slowly than movement activity fluctuations and (b) that other metrics like skin temperature were likely slower still. However, this standardization across metrics was necessary for our proposed analytical approach, is not uncommon in the scientific literature, and still reflected appropriate changes across each metric.
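As an illustration of the epoch standardization described above, the following minimal sketch averages a uniformly sampled signal over successive, non-overlapping 10 second epochs. It is not the study's code: the function name `epoch_means`, the assumption of a constant sampling rate, and the choice to discard a trailing partial epoch are all hypothetical.

```python
from statistics import mean


def epoch_means(samples, fs_hz, epoch_s=10):
    """Average a uniformly sampled signal over non-overlapping epochs.

    samples : sequence of numeric samples
    fs_hz   : sampling rate in Hz (assumed constant)
    epoch_s : epoch length in seconds (10 s in this study)
    """
    n = int(round(fs_hz * epoch_s))  # samples per epoch
    if n <= 0:
        raise ValueError("an epoch must contain at least one sample")
    # Assumption: a trailing partial epoch is discarded rather than padded.
    return [mean(samples[i:i + n]) for i in range(0, len(samples) - n + 1, n)]


# Example: 22.5 s of a made-up 2 Hz signal yields two complete 10 s epochs.
signal = list(range(45))
print(epoch_means(signal, fs_hz=2))  # [9.5, 29.5]
```

Slower-changing signals such as skin temperature would simply vary less from one epoch value to the next under this scheme.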

Behavioral Coding Approach.
For human-animal interactions, one of the best validated behavior coding paradigms is the Observation of Human-Animal Interaction for Research (OHAIRE), which provides a more objective rubric for dyadic assessment [34]. Using one-zero interval sampling, this schema tracks facial, verbal, and other physical indicators from each interactant and from the interaction as a whole before applying standard comparative statistics. Even with this tool, created specifically for and often used in evaluating HAIs, there is considerable variability in the behavior coding tools used, and there is little consensus on which coding schema is most appropriate for assessing psychophysiological states in CAIs [82], [21], [32], [54]. Other approaches include behavior counting, which begins with determining the time point and emotional state behaviors of interest as well as how they will be analytically interpreted [91]. These behaviors, and the time points or time ranges at which they occur, are then demarcated in software tools like BORIS or ELAN, spreadsheets like Excel, and/or handwritten notes before general analysis [29], [28]. Another approach, referred to as Qualitative Behavioral Analysis, has strong support in the social sciences and involves integrating a human's holistic perception of a subject to produce descriptors like "relaxed" or "frustrated" [91], [5], [7]. In other words, if behavior counting can be understood as a quasi-objective observational approach, Qualitative Behavioral Analysis is well described as a quasi-subjective perceptive approach.
Our behavior coding approach, herein referred to as psychophysiological state assignment (PPSA), is a quasi-subjective approach similar to Qualitative Behavioral Analysis that also borrows several elements from the OHAIRE approach. It is informed by extensive evaluation of the CAI literature's coding schemas to isolate reliable indicative behaviors of affective and affiliative states for each species involved. PPSA then involves perceptive coding of each interactant into positive, neutral, or negative psychophysiological states for successive, non-overlapping 10 second epochs throughout the session. To minimize bias and maximize consistency, this coding was done by three raters, two of whom were previously fully trained in the OHAIRE system [12], [83], [32], [63]. Using Cohen's kappa as a measure of inter-rater reliability in post-hoc video coding, the three raters were above the common 80% agreement standard in human-animal interaction studies, scoring 89.9% and 95.7% for humans and canines, respectively [89], [34]. PPSA preserves the temporal benefits of Qualitative Behavioral Analysis and allows raters to use any composition of descriptors to inform assignment to one of the three possible psychophysiological states. These assignments, in turn, can be represented as computer-manipulable numerical variables: -1 for negative states, 0 for neutral states, and 1 for positive states. It is important to note that these state labels are meant to represent clear regions along a spectrum from negative to positive psychophysiological state, whereas normal Qualitative Behavioral Analysis labels are not necessarily similarly interrelated. It is also important to note that our and other researchers' interpretation of behavior is limited, and that disambiguating between subjects' true states and consensus views on what observed behaviors indicate is beyond the scope of this study. With respect to affective state, psychological surveys are our gold standard ground truth before and after subsessions, while the PPSA behavior coding approach serves as a good, semi-continuous ground truth for the duration of interactions and for non-conversant canine subjects. Subsequent analysis of the behavior coding data utilized basic statistical averages and simple percentages, with appropriate exclusion of indeterminate epochs.
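To make the inter-rater reliability measure concrete, the sketch below computes Cohen's kappa for one pair of raters over PPSA codes (-1, 0, 1). Cohen's kappa is defined for two raters; how the three raters' codes were combined is not specified in the text, so applying this function to each rater pair is only one possible reading. The function name and the example codes are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical codes over the same epochs."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must code the same, non-empty set of epochs")
    n = len(rater_a)
    # Observed agreement: fraction of epochs given the same code.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal code frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_exp == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (p_obs - p_exp) / (1.0 - p_exp)


# Made-up PPSA codes for ten epochs (not study data).
a = [1, 1, 0, 0, -1, 0, 1, 0, 0, 0]
b = [1, 0, 0, 0, -1, 0, 1, 0, 0, 1]
print(round(cohen_kappa(a, b), 3))
```

Note that kappa corrects raw percent agreement for chance, so the two quantities are not interchangeable when codes are dominated by a single state.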

Survey Selection & Analysis.
Having extensively reviewed the literature, we considered several options for relevant survey instruments (for an exhaustive list, see Appendix 1 in Wilson et al. 2012) [93]. Six surveys were used in this study for primary comparison and as ground truth options for certain physiological data collected: i) Canine Behavioral Assessment & Research Questionnaire (C-BARQ); ii) Monash Dog Owner Relationship Scale (MDORS); iii) Self-Assessment Manikin (SAM); iv) Positive and Negative Affect Schedule-Short Form (PANAS-SF); v) Human Ergonomics; and vi) Canine Ergonomics. The non-ergonomic short-term surveys, SAM and PANAS, were completed by hand before, between, and after experimental day sub-sessions and are capable of measuring short-term fluctuations in valence, arousal, positive affect, and negative affect or anxiety [55], [78]. These four parameters from these two instruments were the closest to and best measures of our desired conceptualization of psychophysiological state that i) were also available as relatively brief psychological surveys, ii) complemented the affective inferences to be made from our physiological data [68], and iii) were robustly validated in the literature for our and other use cases [10], [65], [26], [50], [22], [11]. The human and canine ergonomics surveys were internally developed and were completed at the very end of the experimental day by the interacting human subject. The C-BARQ and MDORS long-term surveys were completed at the human subject's leisure outside of experimental days. These provided, in MDORS, a common measure of human-canine relationships and, in C-BARQ, a standard evaluation of the dog's general behavior [73], [25], [79]. It is important to note that all human subjects were required to complete the MDORS, but only the dog's primary owner completed the C-BARQ. In addition to following the survey instrument developers' recommendations, the survey data analysis used basic average statistics as well as the Wilcoxon signed rank test for general data comparisons and repeated measures data. We considered a 2-sided p-value of <0.05 to be statistically significant. The ergonomics surveys and the C-BARQ behavioral survey results are not included in this paper's analysis and will be discussed elsewhere.
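For readers unfamiliar with the Wilcoxon signed rank test, the following pure-Python sketch shows the computation for paired scores. It is a stand-in for illustration only and is not the routine used in the study, which relied on a library implementation. The function name is hypothetical, and two choices are assumptions: zero differences are dropped, and the two-sided p-value is computed exactly from the permutation distribution conditional on any tied ranks.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W, exact two-sided p) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n  # ranks are doubled so tied (half-integer) ranks stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # twice the average of ranks i+1 .. j+1
        i = j + 1
    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    total2 = sum(ranks2)
    w_min2 = min(w_plus2, total2 - w_plus2)
    # Null distribution: each rank joins the positive sum with probability 1/2.
    counts = [0] * (total2 + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total2, r - 1, -1):
            counts[s] += counts[s - r]
    p = min(1.0, 2 * sum(counts[:w_min2 + 1]) / 2 ** n)
    return w_min2 / 2, p


# Made-up paired survey scores (not study data).
neutral = [2, 3, 1, 4, 2, 3]
interaction = [3, 5, 4, 8, 7, 9]
print(wilcoxon_signed_rank(neutral, interaction))  # (0.0, 0.03125)
```

With only six pairs, 0.03125 is the smallest two-sided p-value the exact test can return, which illustrates how sample size bounds attainable significance in a pilot study.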

Physiological Data Analysis.
Signal Selection & Calculation. All physiological signal metrics were selected, upon extensive review of human and canine psychophysiology, to be responsive to interaction and indicative of affective states. Using the selected epoch time frames and the classic psychophysiological theoretical framework, we took the raw data from the physiological devices and completed a preprocessing step that included an initial data check and removal of outliers. We then filtered each signal using bandpass Butterworth filters, completed a normalization step, and achieved temporal synchronization across the multimodal device signals as well as with the behavior coding output [75]. The second core analytical step includes two forms of metric extraction: average metric by epoch (ME) and rolling window average by epoch (RE). From the accelerometer signal (also referred to as the activity signal or the inertial measurement unit (IMU) signal), we directly calculate the average, minimum, and maximum acceleration by epoch along each spatial axis before calculating the mean amplitude deviation (MAD) by axis and the integral modulus of acceleration (IMA) across dimensions [94], [6], [52], [88], [87], [30], [2], [9], [13]. Similar IMU metrics to those described above were also supported for analysis of canine activity [4], [31], [81]. From the electrocardiography (ECG) signal, we used ECG waveform R peaks to extract the interbeat interval (IBI) using the "Pyphysio" toolbox in Python 3.7 via Google Colaboratory Jupyter notebooks [8]. With IBI serving as the basis for all other ECG metrics, we then determined heart rate (HR) and three additional heart rate variability (HRV) metrics in the time domain. These included the standard deviation of the IBI of normal sinus beats (SDNN), the root mean square of successive differences between normal heartbeats (RMSSD), and the quotient of SDNN and RMSSD [44], [42]. Briefly, RMSSD estimates "vagally mediated changes" in HR, while SDNN tracks both parasympathetic and sympathetic nervous system activity contributions to the recorded HR [80]. As noted, IBI and HR extraction is standard for ECG analysis, and the three HRV metrics were well supported for both human and canine evaluation of valence, stress, and other psychophysiological constituent states [44], [51], [35], [97], [66], [3], [42], [64], [73], [80]. From the skin temperature (ST) signal collected by the Empatica E4, we simply determined the average ST value by epoch [40], [14]. From the electrodermal activity (EDA) signal, we extracted the average and maximum EDA values to characterize the combined galvanic skin response. We also ran this signal through the developer's EDAExplorer online platform to remove artifacts, to detect the phasic skin conductance response (SCR) peaks for short-term stimuli, and to differentiate the tonic skin conductance level (SCL) long-term baseline [4], [67], [84]. The EDA analysis in this paper focuses only on the SCR short-term stimuli responses. RE, the rolling window metric extraction, calculates the same metrics from the same preprocessed signals as the ME approach but, rather than a sequential averaging by 10 second epoch, it uses a centered, 60 second rolling window to produce a 10 Hz output signal (e.g., from a 200 Hz chest HET ECG signal, RE produces a 10 Hz average heart rate signal). Though we extracted a 10 Hz RE signal for all of our metrics across all 5 devices, the RE output is used herein for correlational analyses of synchrony only. The selected output frequency of 10 Hz was based on the human and canine torso signals held in common (i.e., chest ECG and chest IMU on both subjects). While all signals or metrics were used and investigated throughout the analysis, for economy of space, we present a meaningful subsample of signals in this paper.
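The time-domain ECG metrics named above follow directly from the IBI series, as the minimal sketch below shows. It is not the toolbox-based pipeline used in the study: the function name is hypothetical, IBIs are assumed to be in milliseconds with ectopic beats already removed, and SDNN is computed here as the sample standard deviation (n - 1 denominator), which is one of two common conventions.

```python
from math import sqrt
from statistics import mean, stdev


def hrv_time_domain(ibi_ms):
    """Compute HR, SDNN, RMSSD, and SDNN/RMSSD from interbeat intervals (ms)."""
    if len(ibi_ms) < 2:
        raise ValueError("need at least two interbeat intervals")
    hr_bpm = 60000.0 / mean(ibi_ms)  # 60,000 ms per minute / mean IBI
    sdnn = stdev(ibi_ms)             # assumption: sample SD (n - 1)
    successive = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    rmssd = sqrt(mean([d * d for d in successive]))
    ratio = sdnn / rmssd if rmssd > 0 else float("nan")
    return {"hr_bpm": hr_bpm, "sdnn": sdnn, "rmssd": rmssd, "sdnn_rmssd": ratio}


# Made-up IBIs for one short window (not study data).
metrics = hrv_time_domain([800, 810, 790, 800])
print(metrics["hr_bpm"])  # 75.0
```

In practice, a 10 second epoch contains only a handful of beats, so variability metrics computed per epoch are noisier than those computed over the longer windows typical of the HRV literature.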
ME signals are processed as appropriate to produce the summary tables and heatmaps displayed throughout. For each experimental session, we also calculated the difference between epochs by metric and marked the increase or decrease of each metric over the entire session. Then, referring to our literature review, we assigned a direct or inverse relationship from that metric to the expected effect on psychophysiological state and coded epochs throughout the session for their positive or negative contributions to said state. All metrics are also associated with and grouped according to valence (herein also called stress) or arousal. Afterwards, these heatmaps were inspected visually for vertical and horizontal patterning.
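The directional coding described above can be sketched as follows. This is an illustrative reading rather than the study's implementation: the function name is hypothetical, differences are assumed to be taken between successive epochs, and the relationship is encoded as +1 (direct) or -1 (inverse).

```python
def code_contributions(epoch_values, relationship):
    """Code each epoch-to-epoch change as +1, -1, or 0 toward a target state.

    epoch_values : per-epoch metric averages (ME output)
    relationship : +1 if an increase in the metric is expected to raise the
                   target state, -1 if it is expected to lower it
    """
    if relationship not in (1, -1):
        raise ValueError("relationship must be +1 (direct) or -1 (inverse)")
    codes = []
    for prev, curr in zip(epoch_values, epoch_values[1:]):
        direction = (curr > prev) - (curr < prev)  # sign of the change
        codes.append(direction * relationship)
    return codes


# Made-up RMSSD epoch averages; higher HRV is treated as a direct (+1)
# contributor to a more positive state, per the interpretation guidelines.
print(code_contributions([30, 35, 33, 33], relationship=1))  # [1, -1, 0]
```

Stacking these coded rows for every metric yields a matrix in which rows are metrics and columns are epochs, which is the kind of structure that can be rendered as a heatmap and inspected for row-wise and column-wise patterns.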
Overall Methods and Interpretation. Again, the products of the aforementioned metric extraction (ME) step contain the biometrics averaged over each epoch. We read all of these ME results from the various devices (e.g., HET, harness, and Empatica E4) and synthesize them by computing 6-epoch averages at the beginning, the middle, and the end of each subsession, representing key minutes from the dyad's interaction. For overall reporting of physiological data by signal, we used average test statistics and the Wilcoxon signed rank test to compare between session types, and the Pearson correlation test statistic for comparisons between multimodal data averages across interaction subsessions [18], [3], [44].
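A minimal sketch of the three 6-epoch (one minute) summaries is given below. The function name is hypothetical, and the exact placement of the "middle" window is an assumption (centered on the subsession), since the text does not specify it.

```python
from statistics import mean


def key_minute_means(epoch_values, k=6):
    """Average k consecutive epochs at the start, middle, and end of a subsession."""
    n = len(epoch_values)
    if n < k:
        raise ValueError("subsession is shorter than one k-epoch window")
    mid_start = (n - k) // 2  # assumption: middle window is centered
    return {
        "beginning": mean(epoch_values[:k]),
        "middle": mean(epoch_values[mid_start:mid_start + k]),
        "end": mean(epoch_values[-k:]),
    }


# A 10 minute subsession has 60 ten-second epochs; values here are made up.
print(key_minute_means(list(range(60))))
```

For subsessions shorter than 18 epochs, the three windows overlap, so the summaries are no longer independent.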
Given the large number of physiological signals collected from each dyad, there is some nuance to their individual and joint interpretation. Many, if not most, reported sources find that increases in heart rate, electrodermal activity (e.g., skin conductance responses), and skin temperature generally indicate elevated arousal in humans [18], [77], [60], [59]. Additionally, increases in HRV time domain metrics (specifically, SDNN and RMSSD) indicate a decrease in stress and potentially more positive states or emotions [33], [42], [18], [3], [47]. For interpretation of canine physiological signals, increased HR often indicates higher arousal, and increased HRV also indicates more positive canine states [20], [64], [97], [35], [44], [51]. For both species, we assume that sustained increases in average movement in 3 dimensions over a given epoch indicate more arousal and, thus, less calm states for that subject. We follow these broad field guidelines for interpretation of our results but note that further independent validation of these directionalities for each species is beyond the scope of this work, as there is no "one-to-one relationship between emotional changes and autonomic activation" [49]. Additionally, the debate surrounding a complete psychophysiological theory of emotional states and their interpretation for humans, not to mention animals, is ongoing [27], [18], [71], [60]. Lastly, we acknowledge that psychological surveys and our behavior coding approach, by design, produce state-based outcomes, while the physiological approaches can only produce directional outcomes in comparison to previous time periods' signals.
Synchrony Methods. The wearable systems were located on both human wrists and on the human chest, as well as on the canine's torso and neck. As such, we only consider the torso systems, which represent the signals shared between species, for synchrony investigations of bonding. While of potential interest for exploring previously unknown interrelations and for identifying relevant movements like dog petting, the data from the other subsystems either have no direct correlate in the opposite dyadic counterpart's subsystems or would necessarily result in spurious data (i.e., it is likely not valid to correlate human hand motion to dog neck motion). Additionally, psychophysiological measures closer to the center of mass are generally understood to be less prone to movement artifacts [18], [6]. Using an 18 epoch (i.e., 3 minute) RE slice taken from the middle of each interaction subsession, we use two approaches to determine interactional synchrony as a proxy for bonding. First, the overall Pearson's correlation for our three key ECG metrics (i.e., HR, SDNN, and RMSSD) and one key activity metric (i.e., IMA) is calculated [58], [96]. Second, we further test the metrics' interspecies interaction via the dynamic time warping methodology to track the alignment of these key time series in general and when assuming temporal asynchrony [96], [16].
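The two synchrony measures can be illustrated with the short sketch below. It is a simplified stand-in and not the study's code: the function names are hypothetical, and the dynamic time warping (DTW) variant shown is the basic unconstrained form with an absolute-difference cost, since the text does not specify a window constraint or distance function.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def dtw_distance(x, y):
    """Unconstrained DTW distance with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Made-up series: the second follows the first with a one-step lag.
human = [1, 2, 3, 3]
dog = [1, 2, 2, 3]
print(round(pearson_r(human, dog), 3), dtw_distance(human, dog))
```

In this toy example the pointwise correlation is below 1 while the DTW distance is 0, which shows why DTW is useful when the two interactants' signals share a shape but are offset in time.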

RESULTS
Consistent with the aims of this study, we were able to deploy wearable physiological measurement systems on both human and canine subjects simultaneously and continuously as they interacted. We were also able to analyze these data to begin answering some questions of interest to the field.
Though the GEB smart collar device is integral to the synchronized system for CAI explored herein, we excluded its data from these analyses for two reasons. First, for small and large dogs, the placement of the smart collar was not found to give meaningfully different results than the torso-located harness IMU; the smart collar did, however, tend towards more noise and exogenous movement, as it was attached to loose-fitting collars. Second, beyond physical activity, the smart collar largely collects ambient environmental measures, which will be the focus of future analysis work and are beyond the scope of this paper.

General Survey Responses
For survey responses, we investigated the time and type dependencies of the valence and arousal outputs from SAM and the positive and negative affect outputs from PANAS. The four survey scales were taken during interstitial experimental periods, meaning there was no survey before the baseline session. For positive or negative affect (i.e., "PA" and "NA," respectively), larger numbers indicate more positive or more negative affect [Figure 3]. For the SAM valence and arousal scores (i.e., "V" and "A," respectively), larger values indicate more unhappiness and more calmness, respectively. Where appropriate (i.e., excluding surveys from two participants for missing, incomplete, incorrectly filled out, or otherwise spoiled survey data), we ran the non-parametric Wilcoxon signed-rank test, using the function of the same name from the SciPy library, to compare outcomes for neutral-type to interaction-type sessions [44], [17]. For individual subsessions, some clear patterns emerge. SAM arousal consistently increased after an interaction session on average compared to neutral sessions. A similar pattern can be seen in PANAS positive affect, which reliably increased on average with interaction sessions. SAM valence results by subsession are more variable, but PANAS negative affect indicates a reliable decrease after interaction sessions. Across all neutral vs. interaction session types, we see a significant difference in SAM arousal (p = 0.043) and SAM valence (p = 0.0002). Looking at the PANAS dimensions, the full group of subjects saw a significant difference in positive affect (p = 0.0003), with no major difference in negative affect observable in this study. Overall, our study group saw significant self-reported state changes indicating more arousal, more positive valence, and more positive affect. Though decreases in negative affect were common, no significant change occurred across subjects with canine interaction. While these survey results are preliminary, they are promising and make intuitive sense for CAIs.

General Behavioral Coding Outcomes
For behavior coding, we focused on the percentages of each interaction-type session spent in each psychophysiological state by animal-human pairing, excluding periods where either subject was off screen as indeterminate. We restricted this analysis to interaction-type sessions because dogs were not necessarily resting during the neutral-type sessions (i.e., they were outside of the interaction space pacing, watching, and otherwise waiting), and the neutral session human psychophysiological state codes by epoch were unvarying (as these subjects were instructed to sit and listen to music, read, etc.). Of course, this eliminates any meaningful comparison of behavior coding scores between session types, though it does lend some credence to the significant differences seen between session types for the survey results. More simply, the interaction sessions were characterized by all three psychophysiological states for both participants, whereas neutral session results were completely neutral for the human by design. Beyond these observations, the first notable overall outcome is the high number of neutral ratings by interaction session (i.e., typically over 60% of on-screen time). This indicates that neither interactant is visibly or audibly in a positive state for most of the CAI sessions within our study [Figure 4]. As expected, negative ratings accounted for a vanishingly small percentage of the canine and human behavior codes. Characterizing the majority of positively coded epochs, dogs generally displayed more affiliative and affective behaviors in goal-oriented interactions (i.e., in order to solicit attention or treats). While positive codes for either interactant seemed generally higher for some pairings than others, no other clear patterning emerged across all subjects.
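The percentage-of-session calculation described above can be sketched as follows. The function name is hypothetical, and representing indeterminate (off-screen) epochs as `None` is an assumption made for illustration.

```python
from collections import Counter


def state_percentages(codes):
    """Percent of on-screen epochs spent in each PPSA state (-1, 0, 1).

    Indeterminate epochs (assumed here to be coded as None) are excluded
    from both the numerator and the denominator.
    """
    valid = [c for c in codes if c is not None]
    if not valid:
        raise ValueError("no determinate epochs to summarize")
    counts = Counter(valid)
    return {state: 100.0 * counts[state] / len(valid) for state in (-1, 0, 1)}


# Made-up codes for six epochs, one of them off screen (not study data).
print(state_percentages([0, 0, 1, None, 0, -1]))  # {-1: 20.0, 0: 60.0, 1: 20.0}
```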
We also applied an "exclusive nor" logic gate to the behaviorally coded scores by epoch to investigate the synchrony between dyad members, yielding the percentage of the interaction session for which both members of the dyad had the same one of the three psychophysiological state codes. Epochs with either party off-screen were excluded, and were also the impetus for this novel form of synchrony analysis. Across the board, pairs spent much of the session time in the same psychophysiological state. This is likely due to the high percentage of neutral ratings for both parties in most interactions. Looking at successive sessions, there appeared to be no consistent synchrony patterns as the dyads had more situational contact.
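A minimal sketch of this behavioral synchrony measure is shown below. Because an exclusive-nor gate is strictly defined on binary inputs, the sketch interprets it for three-state codes as an equality test, which matches the "same one of the three codes" description; the function name and the use of `None` for off-screen epochs are hypothetical.

```python
def state_synchrony_percent(human_codes, dog_codes):
    """Percent of determinate epochs in which both dyad members share a code."""
    pairs = [
        (h, d)
        for h, d in zip(human_codes, dog_codes)
        if h is not None and d is not None  # drop epochs with either party off screen
    ]
    if not pairs:
        raise ValueError("no epochs with both parties on screen")
    matches = sum(h == d for h, d in pairs)  # equality acts as a multi-state XNOR
    return 100.0 * matches / len(pairs)


# Made-up codes for five epochs (not study data).
human = [0, 1, 0, None, 1]
dog = [0, 0, 0, 1, 1]
print(state_synchrony_percent(human, dog))  # 75.0
```

Because matching neutral codes count as synchrony, a session dominated by neutral ratings will score high on this measure regardless of how the positive and negative epochs line up.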

General Physiological Data Outcomes
Figure 5 reports our average results across three time points within interaction or neutral subsessions for target signals from our wearable device system. Upon visual inspection, the human and canine heart rate and heart rate variability results do not indicate clear patterning across subjects or session types at this scale of analysis. For the cluster of activity data represented in the last four columns of the table, it appears that left wrist HET movement occurred much less than chest or right wrist movement, which accords with all included human subjects being right-hand dominant. Furthermore, within subsession groupings, each IMA source seems to remain relatively stable, though the differences between neutral and interaction sessions were not statistically significant.
Figure 5 also reports the Wilcoxon signed-rank test p-value for the difference between neutral and interaction session types for each signal presented. Of note, the canine harness HR signal, the human right wrist E4 temperature signal, the human electrodermal activity (EDA) mean, and the EDA max scores differ significantly across subjects in this respect. As noted previously, the canine subjects were removed from their experimental interactant during neutral subsessions and escorted by a researcher during this time. While the dogs were not expected to also engage in neutral behavior and were free to do anything from interacting with the human to resting quietly during these subsessions, these comparison results may indicate that focused one-on-one interaction is meaningfully distinct from free-range activity in this context for canine heart rate. If true, this could speak positively to the idea of at-leisure breaks being recuperative or, at minimum, positively different for therapy dogs while at work. As the skin temperature signals indicate, localized temperatures do appear to rise across subjects as the experimental sessions progress, with the sensor placed on the right hand of all subjects. This is likely due to the increased physical activity with their dog interactant. The strong difference between neutral and interaction sessions may be due to the relatively low baseline temperatures initially observed on average. Generally, the EDA average amplitude by epoch and the EDA maximum amplitude by epoch, both arousal indicators, seem to reliably and significantly increase during interaction sessions, as expected. This comports well with the previously discussed survey self-report findings of increased arousal after interaction sessions across subjects. It is worth noting here that, across our analyses and in keeping with other studies, EDA seems to be one of the more reliable and responsive differentiators between neutral and interaction sessions for human participants throughout the experiment. Upon further analysis, other signals may prove to be individually predictive or also correlated with overall affective states, but EDA metrics appear to have clear and multifaceted support between session types.
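For readers unfamiliar with the paired comparison used here, the following is a minimal sketch of a two-sided exact Wilcoxon signed-rank test written with the Python standard library only. It is not the software used in the study: the function name is hypothetical, the exact enumeration is one of several ways to obtain a p-value and is only practical for small samples, and the example values are made up for illustration.

```python
from itertools import product
from typing import Sequence, Tuple


def wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W, two-sided exact p) for paired samples x and y.

    Zero differences are dropped; tied absolute differences share an average rank.
    All 2**n sign patterns are enumerated, so keep n small.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Under the null hypothesis, each rank carries either sign with equal probability.
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=n):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if min(s, total - s) <= w:
            hits += 1
    return w, hits / 2 ** n


# Illustrative per-subject values (arbitrary units), not data from the study.
neutral = [1, 2, 3, 4, 5, 6]
interaction = [2, 4, 6, 8, 10, 12]
print(wilcoxon_signed_rank(neutral, interaction))  # (0, 0.03125)
```

With only six pairs, the smallest attainable two-sided p-value is 2/64, which illustrates why small pilot samples limit how strong the reported significance can be.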

Multimodal Composite Results
To derive composite results, we took an in-depth look into some CAI sessions to see how the patterning of metrics contributed to the overall outcomes. As noted before, this was done by taking the ME outputs and tracking whether they increased or decreased from epoch to epoch. Then, using carefully selected directional indicators from the literature, we created heatmaps that represent the 3-minute segment directionality of the available valence and arousal dimensions in a bonded individual, shown in Figure 6 [97], [50], [33], [71]. In this heatmap, the blue section represents metrics correlated negatively with stress while the red section represents positive arousal metrics; canine metrics are below the dashed line on each dimension's chart. A solid color indicates an increase while the absence of color (e.g. off-white tinted red or blue) indicates a decrease in the psychophysiological state metric for that epoch. Within each section, a dotted line separates the human signals from canine signals, which are further demarcated by "h_" and "c_" prefixes for human-sourced and canine-sourced signals, respectively. The signal type (i.e. physiological, survey, and behavior coding) naming conventions follow the common abbreviations previously indicated in this paper. For these charts, blue blocks thus represent psychophysiological state increases along the valence dimension and red blocks indicate psychophysiological state increases along the arousal dimension.
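The epoch-to-epoch directionality underlying such a heatmap can be sketched as below. This is an illustrative reduction under stated assumptions, not the study's pipeline: the function names, the metric names other than the "h_" and "c_" prefixes, the handling of unchanged values, and the example numbers are all hypothetical. Any sign flip needed for metrics that correlate negatively with a dimension is left out.

```python
from typing import Dict, List, Optional, Sequence


def epoch_directions(values: Sequence[float]) -> List[Optional[int]]:
    """Map each epoch-to-epoch change to +1 (increase), -1 (decrease), or 0 (no change).

    The first epoch has no predecessor, so it is reported as None.
    """
    if not values:
        return []
    out: List[Optional[int]] = [None]
    for prev, cur in zip(values, values[1:]):
        out.append((cur > prev) - (cur < prev))
    return out


def direction_grid(metrics: Dict[str, Sequence[float]]) -> Dict[str, List[Optional[int]]]:
    """One row of directions per metric; rows stack into the heatmap."""
    return {name: epoch_directions(series) for name, series in metrics.items()}


# Illustrative per-epoch metric values, not data from the study.
example = {
    "h_eda_mean": [0.42, 0.55, 0.61, 0.58],
    "c_hr": [96.0, 104.0, 104.0, 99.0],
}
SYMBOLS = {None: ".", 1: "+", 0: "=", -1: "-"}
for name, row in direction_grid(example).items():
    print(f"{name:<12}" + "".join(SYMBOLS[d] for d in row))
```

Reading a column of the resulting grid corresponds to the vertical inspection for cross-signal coherence within an epoch, and reading a row corresponds to the horizontal inspection for a trend in a single signal.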
Unexpectedly, we see no clear overall patterning for each subsession by type across subjects. We do notice, however, that the neutral session's human EDA mean and EDA max metrics decrease noticeably for most subjects when compared to interaction sessions. This reflects the significant change in surveyed arousal score and the strength of EDA as an arousal metric. For canines, the neutral, baseline, and postline session metrics do not reflect resting. However, looking vertically, canine valence epochs tend to show a higher degree of coherence across signals and metrics (i.e. all increase or all decrease). These representative visual examples of the patterning within sessions, juxtaposed with the survey outcomes, are uniquely made available to researchers by the continuous, multimodal wearable system coupled with our experimental approach, and allow for multimodal output alignment. Taken altogether, this heatmap representation, showcased in Figure 6, indicates to researchers the dynamics of the session or session slice across behavior coding and physiological signals as well as the survey outcomes that bracket the interaction. It also allows for fast visual inspection of vertical bands for signal coherence or horizontal bands for expected macro trends in certain signals (e.g. EDA signals consistently decreasing during a neutral session, canine RMSSD indicating negative experience, etc.) [44], [35], [84], [85].

Physiological Data Snapshot
The Figure 7 raw signal plot is a glimpse at the original ECG data for human and canine subjects that was simultaneously produced by our multimodal system during the experiment. Colored regions mark brackets of interesting activity where the metrics might deserve inspection. In our approach, this is useful for several reasons. First, it highlights basic, enduring differences between species, like canine heart rate being faster than the human's on the whole. Second, indications from other data streams could prompt us to look at the raw and derived signals for that time period for further inspection and analysis (e.g. during behavior coding, a visually observed strong negative reaction in the dog vs. the giving/receiving of a treat).

Behavioral Coding Subset
Like the heatmap, a synchrony table provides an interesting multimodal snapshot of the experimental data from this study. Though we could not show the data from all 34 interaction sessions, the table in Figure 8 showcases 2 interaction subsessions each from 3 humans as they interact with the same dog. The arousal, valence, positive affect, and negative affect survey scores do not show clear patterning based on bondedness here. However, lower MDORS scores indicate a stronger bond and, as expected, H1 (C1's owner) proved to be the most bonded to the dog by survey result, H2 (a friend of the dog) was less bonded, and H3 (a stranger to C1) was the least bonded. These differences and their ordering are also directly reflected in the behavior-coded amount of time each pairing spends in positive states. For both interactions presented, H1 and C1 each spent much more time in positive states than the moderately bonded pairing of H2 and C1 or the weakly bonded pairing of H3 and C1. As a result, the MDORS survey score and the amount of time each member of the interacting dyad spent in positive states are the measures that most closely track the expected level of bondedness. A potential counter-indicator of bonding is the presence of negatively coded epochs for the canine. While there were relatively few negative states coded throughout the entire pilot experiment (1.6%), all of them occurred in interaction sessions between a dog and a non-bonded human (i.e. when the canine was not interacting with his owner). Surprisingly and counter to our hypothesis, behaviorally coded epochs spent in the same state appear to be much lower in the bonded pair when compared to moderately and weakly bonded pairings. This unexpected result actually follows from the fact that, in most cases, the majority of an interaction session was spent in neutral states, leading to a very high same-state percentage in non-bonded interaction subsessions. In bonded pairings, the dog and human matched in some epochs but largely differed due to the nuances of certain interaction behavior sequences. For example, in some instances, the human would display positive affective and affiliative behaviors while the dog consumed a treat whereas the dog displayed these behaviors as the   is likely due to the significantly higher canine heart rate when compared to humans, and may also factor in certain differences in heart rate variabilities between the two interactants. These two sets of correlation results strongly hint that physiological synchronization between participants warrants further exploration as a measure of bonding.

Multimodal Correlation Matrix
We computed a Pearson correlation matrix across all subjects and the full multimodal data set as an exploratory analytical approach. Focusing only on sessions where both species of subjects interact (i.e. no neutral sessions), we took the average of the middle minute of data for each behavior coding and physiological signal, as well as the post-interaction survey scores, to populate the matrix. This resulted in a comprehensive overlay of signal interactions across the experimental sessions and subjects [Figure 9]. Of considerable note, time series HRV indicators have strong positive associations within species, as expected, but also across species. These are already considered to be some of the best indicators of psychophysiological states and could serve as a reliable indicator of interspecies interaction or bond quality in future work. The integral modulus of acceleration (IMA) showed some moderate correlations in a few signal types. For the human right wrist, the IMA was associated with skin temperature, possibly indicating a heating effect of additional human movement, likely due to stroking, brushing, and other interaction-specific activities. The canine chest IMA is also moderately associated with human skin temperature for reasons that are less intuitively clear. This IMA variant also associates moderately with behavior coding for the canine, and with the RMSSD heart rate variability metric. That finding may indicate that rater perception of canine state is somewhat influenced by the dog's movement and potentially reaffirms previous findings that RMSSD is a reliable state indicator in dogs [44], [35]. Lastly, though most other correlations between the multimodal signals from this experiment were weak, the human self-report arousal scale was moderately associated with the positive affect self-report scale. This relationship is echoed in our other analyses and potentially indicates that a contributing factor for overall positive affect in humans is the level of arousal inspired by the interaction with the dog.
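As a point of reference, a Pearson correlation matrix over per-session feature values can be computed as in the sketch below. This is a standard-library illustration under assumptions, not the code used for Figure 9: the feature names and values are hypothetical, and each list is assumed to hold one value per interaction session (for example, a middle-minute average).

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when a series is constant
    return sxy / sqrt(sxx * syy)


def correlation_matrix(features: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise Pearson correlations between all named features."""
    names = list(features)
    return {a: {b: pearson(features[a], features[b]) for b in names} for a in names}


# Illustrative per-session values, not data from the study.
features = {
    "h_rmssd": [31.0, 42.0, 38.0, 55.0, 47.0],
    "c_rmssd": [60.0, 81.0, 70.0, 98.0, 90.0],
    "h_temp": [30.1, 31.4, 30.8, 30.2, 31.9],
}
matrix = correlation_matrix(features)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

The matrix is symmetric with ones on the diagonal, so only the upper or lower triangle needs to be inspected when looking for within-species and cross-species associations.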

CONCLUSION AND FUTURE WORK
In this work, we showed that we can use integrated systems of wearable devices to look at both human and canine interactants. We were also able to peer into the underlying dynamics of the continuous CAI interactions that lead to the macro pre-/post-survey results commonly reported in the field. Of particular note, we presented three novel multimodal data representations for potential characterization of CAIs: a subsession heatmap, a synchrony table, and a metric correlation matrix. Lastly, several of our exploratory analyses yielded interesting proof-of-concept results to inspire future investigations.
Overall, this pilot study confirmed common CAI field results like canine heart rate being significantly higher than humans' during interactions, and humans generally reporting positive to neutral outcomes thanks to the interaction. Interrogating the physiological data collected in this study, we found that the EDA measures were the most meaningfully distinct between neutral and interaction sessions across subjects. For survey data, we saw significant positive changes in subjects' arousal, emotional valence, and positive affect with canine interaction. Counterintuitively, most interaction time periods were rated as neutral, with relatively fewer positive epochs and significantly fewer negatively coded epochs. However, we suspect that this is partially influenced by the chosen coding schema and epoch duration. This preponderance of session neutrality also contributed to the moderately high amount of interspecies synchrony observed behaviorally, though bonded pairs seemed to have lower levels of coded synchrony than expected. While the physiological synchrony results hint at promising associations, the results were not definitive for the four metrics interrogated (i.e. heart rate, SDNN, RMSSD, and activity IMA). Canine surveys were not employed, but the standardized measure we used showed clear bond quality discriminatory power between owner, friend, and stranger to a dog. The independent canine results of potential interest are the associations between canine chest integral modulus of acceleration and human skin temperature, canine behavioral coding, and the dog's own RMSSD heart rate variability. Though moderate, these indicate several potential areas of follow-up investigation on the canine side. Lastly, dogs only seemed to experience negatively coded epochs with unbonded human interactants as a result of a human action (e.g. sudden movement, picking the dog up, etc.). Human negative behavioral responsivity clustered around frustration when the dog employed repeated avoidance behaviors.

Figure 3: Neutral session to Interaction session summary statistics and comparisons for the SAM and PANAS surveys. Notes: SAM-V = valence; SAM-A = arousal; PANAS-PA = positive affect; PANAS-NA = negative affect; sd = standard deviation; INT1 = Interaction Session 1; INT2 = Interaction Session 2; NEU = Neutral Session; BASE = Baseline session; POST = postline session; ALL NEU = all neutral sessions; ALL INT = all interaction sessions; significant values noted in bold.

Figure 4: Average Behavioral Coded State across all Interaction sessions and subjects. Notes: h- = human subjects; c- = canine subjects; INT1 = Interaction Session 1; INT2 = Interaction Session 2; pos = positive code; neu = neutral code; neg = negative code.

Figure 5: Physiological Data Summary Table. Notes: NEU = neutral session; INT = interaction session; S+1m = session start plus 1 minute; M = middle of session; E-1m = session end minus 1 minute; HET = Health and Environment Tracker (on human); wHET = HET on wrist; cHET = HET on chest; HAR = harness (on dog chest); E4 = Empatica E4 (on human wrist); hr = heart rate; sdnn = standard deviation of NN intervals; rmssd = root mean square of successive differences between heartbeats; TEMP = skin temperature; EDA Mean = average electrodermal activity by epoch; EDA Max = maximum electrodermal activity by epoch; EDA Peak Ct = number of peaks in epoch of electrodermal activity; IMA = integral modulus of acceleration.

Figure 7: ECG Signal CAI with Highlighted Events