Abstract
Detecting aesthetic highlights is a challenge for understanding the affective processes that take place during movie watching. In this article, we study spectators’ responses to aesthetic stimuli in movies in a social context, and we seek to uncover the emotional component of aesthetic highlights. Our assumption is that spectators’ physiological and behavioral reactions are synchronized during these highlights because (i) filmmakers make aesthetic choices to elicit specific emotional reactions (e.g., special effects, empathy, and compassion toward a character) and (ii) watching a movie together synchronizes spectators’ affective reactions through emotional contagion. To detect aesthetic highlights in movies, we compare different approaches to estimating synchronization among multiple spectators’ signals, such as pairwise, group, and overall synchronization measures. The results show that an unsupervised architecture relying on synchronization measures captures different properties of spectators’ synchronization and detects aesthetic highlights from both electrodermal and acceleration signals. Pairwise synchronization measures are the most accurate, regardless of highlight category and movie genre. Moreover, electrodermal signals have more discriminative power than acceleration signals for highlight detection.
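To make the notion of a pairwise synchronization measure more concrete, the following is a minimal sketch under stated assumptions, not the measures evaluated in the article. It assumes that synchronization is estimated as the Pearson correlation between each pair of spectators' signals within a sliding window, averaged over all pairs, and that a window is flagged as a candidate highlight when this score exceeds a fixed threshold. The function names, the window and step sizes, the threshold, and the toy signals are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def pairwise_sync(signals, window, step):
    """Mean pairwise correlation across spectators for each sliding window."""
    length = min(len(s) for s in signals)
    scores = []
    for start in range(0, length - window + 1, step):
        segments = [s[start:start + window] for s in signals]
        pairs = [pearson(a, b) for a, b in combinations(segments, 2)]
        scores.append(sum(pairs) / len(pairs))
    return scores


def detect_highlights(scores, threshold):
    """Indices of windows whose synchronization score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy data (hypothetical): three spectators, eight samples each.
# In the first four samples all signals rise together; afterwards they diverge.
spectators = [
    [0, 1, 2, 3, 0, 1, 0, 1],
    [1, 2, 3, 4, 1, 0, 1, 0],
    [0, 2, 4, 6, 0, 0, 1, 1],
]

scores = pairwise_sync(spectators, window=4, step=4)
highlights = detect_highlights(scores, threshold=0.5)
print(highlights)  # [0]
```

In this toy example only the first window is flagged, because the three signals move together there and diverge afterwards. The approach is unsupervised in the sense that no labeled highlights are needed to compute the scores; choosing the threshold and the window length remains a design decision that this sketch does not address.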
Index Terms
Aesthetic Highlight Detection in Movies Based on Synchronization of Spectators’ Reactions