Abstract
Natural language generation (NLG) is an essential technique for applications such as XiaoIce and Siri, and it has attracted increasing attention recently. To improve the user experience, several emotion-aware NLG methods have been developed to generate responses coherent with a pre-designated emotion (e.g., positive or negative). Nevertheless, existing methods cannot generate personalized responses because they frequently overlook personalized linguistic style, even though different human responders tend to have different linguistic styles. Inspired by this, we focus on a novel research theme of personalized emotion-aware NLG (PENLG), whereby the generated responses should be coherent with both a pre-designated emotion and the linguistic style of a pre-designated responder. In particular, we study PENLG in the scenario of generating personalized emotion-aware responses to social media posts. This task faces two research challenges: (1) user linguistic styles are implicit and complex by nature, and hence their representations are hard to learn; and (2) linguistic styles and emotions are usually expressed in different manners in a response, and thus conveying both properly in the generated responses is not easy. Toward this end, we present a novel PENLG scheme, named CRobot, which consists of a personalized emotion-aware response generator and two discriminators, i.e., a general discriminator and a personalized emotion-aware discriminator. More specifically, post-based and avatar-based user linguistic style modeling methods are incorporated into the encoder-decoder–based generator, while the discriminators are devised to ensure that the generated response is fluent and consistent with both the emotion and the linguistic style of the user. Different from traditional adversarial networks, we embed adversarial learning under the umbrella of reinforcement learning. In this way, response generation can be tackled by the generator taking a sequence of actions, each selecting the proper word to output at one timestep. To justify our model, we construct a large-scale response generation dataset based on Twitter, consisting of 6,763 tweets with 1,461,713 corresponding responses created by 153,664 users. Extensive experiments demonstrate that CRobot surpasses the state-of-the-art baselines in both subjective and objective evaluation.
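To make the idea of "generation as a sequence of actions" concrete, the following is a minimal sketch of a policy-gradient (REINFORCE-style) update in pure Python. It is an illustration under loud assumptions, not CRobot itself: the generator is reduced to a table of per-timestep logits rather than an encoder-decoder, and `toy_reward` is a hypothetical stand-in for the scores that the two discriminators would provide. All function names and hyperparameters here are invented for the example.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits_per_step, rng):
    """Generator: take one action (choose one word index) per timestep."""
    actions = []
    for logits in logits_per_step:
        probs = softmax(logits)
        actions.append(rng.choices(range(len(probs)), weights=probs)[0])
    return actions


def toy_reward(actions, target):
    """Hypothetical stand-in for discriminator feedback (assumption):
    the fraction of positions that match a fixed target sequence."""
    return sum(a == t for a, t in zip(actions, target)) / len(target)


def reinforce_step(logits_per_step, actions, reward, baseline, lr):
    """One REINFORCE update, in place.

    For a softmax policy, d log pi(a) / d logit_k = 1[k == a] - p_k,
    so each logit moves by lr * (reward - baseline) * that gradient.
    """
    advantage = reward - baseline
    for logits, a in zip(logits_per_step, actions):
        probs = softmax(logits)
        for k in range(len(logits)):
            grad = (1.0 if k == a else 0.0) - probs[k]
            logits[k] += lr * advantage * grad


def train(vocab_size=5, length=4, steps=3000, lr=0.2, seed=0):
    """Run the sample / score / update loop on the toy problem."""
    rng = random.Random(seed)
    target = [rng.randrange(vocab_size) for _ in range(length)]
    logits = [[0.0] * vocab_size for _ in range(length)]
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(steps):
        actions = sample_sequence(logits, rng)
        reward = toy_reward(actions, target)
        reinforce_step(logits, actions, reward, baseline, lr)
        baseline = 0.9 * baseline + 0.1 * reward
    greedy = [max(range(vocab_size), key=lambda k: row[k]) for row in logits]
    return greedy, target


if __name__ == "__main__":
    greedy, target = train()
    print("greedy output:", greedy)
    print("target       :", target)
```

In an adversarial setup of the kind the abstract describes, the reward would come from learned discriminators that are trained alongside the generator, rather than from a fixed target as in this toy.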
Index Terms
Response Generation by Jointly Modeling Personalized Linguistic Styles and Emotions