A Multimodal Framework for Large-Scale Emotion Recognition by Fusing Music and Electrodermal Activity Signals

Abstract
Physiological signal-based emotion recognition has attracted considerable attention in affective computing. Because it is reliable and easy to acquire, electrodermal activity (EDA) is particularly attractive for practical applications. However, EDA-based emotion recognition across large numbers of subjects remains difficult: traditional classifiers built on hand-crafted features have limited representation ability, while deep models with automatic feature extraction tend to overfit in the presence of large individual differences. Since music is strongly correlated with human emotion, the static music stimulus can serve as an external benchmark that constrains the highly variable, dynamic EDA signals. In this article, we fuse each subject’s individual EDA features with features of the external music that evoked them, and we propose an end-to-end multimodal framework, the one-dimensional residual temporal and channel attention network (RTCAN-1D). For the EDA branch, a channel-temporal attention mechanism is introduced, for the first time in EDA-based emotion recognition, to mine the temporal and channel-wise dynamic and steady features. Comparisons with state-of-the-art single-EDA models on the DEAP and AMIGOS datasets demonstrate the effectiveness of RTCAN-1D at mining EDA features. For the music branch, we process the audio with the open-source toolkit openSMILE to obtain external feature vectors. Systematic and extensive evaluations on PMEmo, currently the largest music emotion dataset, validate that fusing EDA and music is a reliable and efficient solution for large-scale emotion recognition.
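To make the architecture described above concrete, the following is a minimal PyTorch sketch of a 1-D residual block with squeeze-and-excitation style channel attention and soft temporal attention, followed by late fusion with an external music feature vector. This is not the authors' released code: the layer sizes, the use of global average pooling, and fusion by simple concatenation are illustrative assumptions, and the actual RTCAN-1D may differ in depth and detail.

```python
# Minimal sketch (assumptions noted above) of channel + temporal attention
# on 1-D EDA feature maps, fused with an external music feature vector.
import torch
import torch.nn as nn


class ChannelAttention1D(nn.Module):
    """Squeeze-and-excitation style channel attention for 1-D feature maps."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                          # x: (batch, channels, time)
        w = self.fc(x.mean(dim=-1))                # squeeze over time -> (batch, channels)
        return x * w.unsqueeze(-1)                 # re-weight each channel


class TemporalAttention1D(nn.Module):
    """Soft attention over time steps, emphasising informative EDA segments."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv1d(channels, 1, kernel_size=1)

    def forward(self, x):                          # x: (batch, channels, time)
        a = torch.softmax(self.score(x), dim=-1)   # (batch, 1, time) attention weights
        return x * a


class ResidualAttentionBlock1D(nn.Module):
    """1-D residual block followed by channel and temporal attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
        )
        self.channel_att = ChannelAttention1D(channels)
        self.temporal_att = TemporalAttention1D(channels)

    def forward(self, x):
        out = self.conv(x) + x                     # residual connection
        out = self.channel_att(out)
        out = self.temporal_att(out)
        return torch.relu(out)


class FusionClassifier(nn.Module):
    """Late fusion of the learned EDA representation with music features."""
    def __init__(self, eda_channels: int = 32, music_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.block = ResidualAttentionBlock1D(eda_channels)
        self.head = nn.Linear(eda_channels + music_dim, n_classes)

    def forward(self, eda, music):                 # eda: (B, C, T), music: (B, music_dim)
        h = self.block(eda).mean(dim=-1)           # global average pooling over time
        return self.head(torch.cat([h, music], dim=1))
```

In this sketch the external music vector would come from openSMILE functionals (for example, via the Python `opensmile` package), standardised before fusion; the specific openSMILE feature set used in the paper is not stated in the abstract, so that choice is left open here.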