Abstract
Emotion recognition in conversation is an essential task in natural language processing. However, annotated data for this task is scarce, because such data is hard to collect and label. In contrast, large-scale data for conversational generation is available and requires no manual annotation. A remaining problem is that the vector spaces learned from different datasets may not be similar, which hinders transfer between them. We therefore use the same dataset to train both the conversational generator and the classifier, and transfer knowledge between them. Specifically, we propose an Emotion Recognition with Conversational Generation Transfer (ERCGT) framework that models the interaction among utterances through transfer learning. In the first step, we train a conversational generator. In the second step, a transfer learning model transfers the knowledge of the generator to the emotion recognition model. Empirical studies on three benchmark emotion classification datasets show that the proposed framework outperforms several strong baselines.
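The two-step idea can be pictured with a deliberately tiny sketch. It is not the ERCGT implementation: the abstract does not specify the generator, the transfer model, or the classifier, so everything below is an assumption made for illustration. Step 1 uses a crude proxy for generator training (word vectors of an utterance are pulled toward the mean vector of its reply), and step 2 copies those vectors into a classifier and trains only a perceptron-style head on the emotion labels of the same dialogues. All function names (`pretrain_encoder`, `train_classifier`, `predict`) and the toy data are hypothetical.

```python
import copy
import random


def mean_vec(emb, utterance, dim):
    """Average the vectors of the known words in an utterance."""
    words = [w for w in utterance.split() if w in emb]
    if not words:
        return [0.0] * dim
    return [sum(emb[w][i] for w in words) / len(words) for i in range(dim)]


def pretrain_encoder(dialogues, dim=8, epochs=30, lr=0.05, seed=0):
    """Step 1 (toy proxy, NOT a real generator): move each utterance's word
    vectors toward the mean vector of the reply that follows it."""
    rng = random.Random(seed)
    vocab = sorted({w for d in dialogues for utt, _ in d for w in utt.split()})
    emb = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in vocab}
    for _ in range(epochs):
        for d in dialogues:
            for (ctx, _), (reply, _) in zip(d, d[1:]):
                target = mean_vec(emb, reply, dim)
                for w in ctx.split():
                    emb[w] = [x + lr * (t - x) for x, t in zip(emb[w], target)]
    return emb


def score(weights, x):
    """Dot product between a label's weight vector and an utterance vector."""
    return sum(a * b for a, b in zip(weights, x))


def train_classifier(dialogues, pretrained, labels, dim=8, epochs=50, lr=0.5):
    """Step 2: copy the pretrained vectors (the transfer), keep them frozen,
    and fit a perceptron-style head on the emotion labels."""
    emb = copy.deepcopy(pretrained)  # transferred parameters
    head = {lab: [0.0] * dim for lab in labels}
    for _ in range(epochs):
        for d in dialogues:
            for utt, gold in d:
                x = mean_vec(emb, utt, dim)
                pred = max(labels, key=lambda lab: score(head[lab], x))
                if pred != gold:  # standard multiclass perceptron update
                    head[gold] = [h + lr * xi for h, xi in zip(head[gold], x)]
                    head[pred] = [h - lr * xi for h, xi in zip(head[pred], x)]
    return emb, head


def predict(emb, head, utterance, dim=8):
    """Return the label whose head vector scores the utterance highest."""
    x = mean_vec(emb, utterance, dim)
    return max(head, key=lambda lab: score(head[lab], x))


# The same toy dialogues serve both steps, mirroring the same-dataset setup.
dialogues = [
    [("i am so happy today", "joy"), ("that is great news", "joy")],
    [("i lost my keys", "sadness"), ("oh no that is awful", "sadness")],
]
labels = ["joy", "sadness"]
encoder = pretrain_encoder(dialogues)
emb, head = train_classifier(dialogues, encoder, labels)
print(predict(emb, head, "i am so happy today"))
```

Freezing the transferred vectors is only one possible design choice. Fine-tuning them together with the head would be another, and the abstract does not say which the framework uses.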