Neural Conversation Generation with Auxiliary Emotional Supervised Models

Published: 17 September 2019

Abstract

An important aspect of developing dialogue agents is endowing a conversation system with the ability to perceive and express emotion. Most existing emotional dialogue models adapt and extend poorly to different scenarios because they either require a specified emotion category or rely on a fixed emotion dictionary. To overcome these limitations, we propose neural conversation generation with an auxiliary emotional supervised model (nCG-ESM), which comprises a sequence-to-sequence (Seq2Seq) generation model and an emotional classifier used as an auxiliary model. The emotional classifier was trained to predict the emotion distributions of the dialogues, which were then used as emotion supervision signals to guide the generation model toward diverse emotional responses. The proposed nCG-ESM is flexible enough to generate emotionally diverse responses, with either specified or unspecified emotions, and can be adapted and extended to different scenarios. We conducted extensive experiments on a popular dataset of Weibo post-response pairs. Experimental results showed that the proposed model was capable of producing more diverse, appropriate, and emotionally rich responses, yielding substantial gains in diversity scores and human evaluations.
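The abstract does not state the training objective, so the following is only an illustrative sketch of how an emotion distribution from an auxiliary classifier could act as a supervision signal alongside a Seq2Seq likelihood. It assumes a weighted sum of the token-level negative log-likelihood and a KL-divergence term between a target emotion distribution and the distribution the classifier assigns to the generated response. The function names (`token_nll`, `kl_divergence`, `joint_loss`), the `weight` parameter, and the choice of KL divergence are all hypothetical and are not taken from the paper.

```python
import math


def token_nll(step_probs, target_ids):
    """Mean negative log-likelihood of the reference tokens.

    step_probs[i] is the decoder's probability distribution over the
    vocabulary at step i; target_ids[i] is the reference token index.
    """
    total = sum(math.log(probs[tok]) for probs, tok in zip(step_probs, target_ids))
    return -total / len(target_ids)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete emotion distributions."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def joint_loss(step_probs, target_ids, target_emotion, response_emotion, weight=0.5):
    """Assumed objective: generation loss plus a weighted emotion term.

    target_emotion: distribution the classifier predicts for the reference
    response (the supervision signal).
    response_emotion: distribution the classifier predicts for the
    generated response.
    """
    generation = token_nll(step_probs, target_ids)
    emotion = kl_divergence(target_emotion, response_emotion)
    return generation + weight * emotion


# Toy example: a two-token response over a two-word vocabulary and
# three hypothetical emotion classes.
step_probs = [[0.5, 0.5], [0.25, 0.75]]
target_ids = [0, 1]
target_emotion = [0.7, 0.2, 0.1]
matched = joint_loss(step_probs, target_ids, target_emotion, [0.7, 0.2, 0.1])
mismatched = joint_loss(step_probs, target_ids, target_emotion, [0.1, 0.2, 0.7])
```

In this toy setup the emotion term vanishes when the generated response's predicted distribution matches the target, so `matched` reduces to the plain generation loss, while `mismatched` is larger. Because the signal is a full distribution rather than a single label, a sketch like this does not need a specified emotion category as input, which is the flexibility the abstract emphasizes.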



Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 2
March 2020
301 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3358605

Copyright © 2019 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 17 September 2019
• Accepted: 1 July 2019
• Revised: 1 February 2019
• Received: 1 April 2018

Qualifiers

• research-article
• Research
• Refereed
