research-article

Response Generation by Jointly Modeling Personalized Linguistic Styles and Emotions

Published: 16 February 2022

Abstract

Natural language generation (NLG) has been an essential technique for various applications, such as XiaoIce and Siri, and has attracted increasing attention recently. To improve the user experience, several emotion-aware NLG methods have been developed to generate responses coherent with a pre-designated emotion (e.g., positive or negative). Nevertheless, existing methods cannot generate personalized responses, as they frequently overlook the personalized linguistic style. Apparently, different human responders tend to have different linguistic styles. Inspired by this, in this work we focus on a novel research theme of personalized emotion-aware NLG (PENLG), whereby the generated responses should be coherent with both a pre-designated emotion and the linguistic style of a pre-designated responder. In particular, we study PENLG in the scenario of generating personalized emotion-aware responses to social media posts. This task faces two research challenges: (1) user linguistic styles are implicit and complex by nature, and hence it is hard to learn their representations; and (2) linguistic styles and emotions are usually expressed in different manners in a response, and thus conveying both properly in the generated response is non-trivial. Toward this end, we present a novel PENLG scheme, named CRobot, which consists of a personalized emotion-aware response generator and two discriminators: a general discriminator and a personalized emotion-aware discriminator. To be more specific, post-based and avatar-based user linguistic style modeling methods are incorporated into the encoder-decoder-based generator, while the discriminators are devised to ensure that the generated response is fluent and consistent with both the emotion and the linguistic style of the user. Different from traditional adversarial networks, we embed adversarial learning under the umbrella of reinforcement learning. In this way, the response generation problem can be tackled by the generator taking a sequence of actions, selecting the proper word at each timestep of the output. To justify our model, we construct a large-scale response generation dataset based on Twitter, consisting of 6,763 tweets with 1,461,713 corresponding responses created by 153,664 users. Extensive experiments demonstrate that CRobot surpasses the state-of-the-art baselines in both subjective and objective evaluations.
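The reinforcement-learning view of generation described in the abstract (each decoding step is an action, with discriminator feedback serving as the reward) can be illustrated with a deliberately tiny REINFORCE sketch. Everything below is a stand-in assumption rather than CRobot itself: a tabular policy replaces the neural encoder-decoder generator, and a hand-written keyword check replaces the learned discriminators.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["i", "love", "hate", "this", "<eos>"]
V, T = len(VOCAB), 4  # vocabulary size, maximum response length

# Tabular policy: logits[t, w] scores word w at timestep t -- a stand-in
# for the decoder's output distribution in a real encoder-decoder.
logits = np.zeros((T, V))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sample_response():
    """Roll out a response, choosing one word (action) per timestep."""
    words = []
    for t in range(T):
        w = int(rng.choice(V, p=softmax(logits[t])))
        words.append(w)
        if VOCAB[w] == "<eos>":
            break
    return words

def reward(words):
    """Toy stand-in for the discriminators: reward responses containing
    a hypothetical designated positive word."""
    return 1.0 if "love" in [VOCAB[w] for w in words] else 0.0

# REINFORCE update: raise the log-probability of every action taken in a
# rollout, scaled by the rollout's reward from the "discriminator".
lr = 0.5
for _ in range(300):
    words = sample_response()
    r = reward(words)
    for t, w in enumerate(words):
        grad = -softmax(logits[t])  # d log p(w) / d logits = onehot(w) - p
        grad[w] += 1.0
        logits[t] += lr * r * grad

# After training, sampled responses strongly favor the rewarded word.
print([VOCAB[w] for w in sample_response()])
```

In the paper's full setting, the scalar reward would come from the general and personalized emotion-aware discriminators rather than a keyword rule, but the gradient structure, rewarding whole-sequence rollouts per word-selection action, is the same.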



Published in
ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 2 (May 2022), 494 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3505207


Publisher
Association for Computing Machinery, New York, NY, United States

Publication History
• Published: 16 February 2022
• Accepted: 1 July 2021
• Revised: 1 June 2021
• Received: 1 April 2021
Published in TOMM Volume 18, Issue 2


      Qualifiers

      • research-article
      • Refereed
