
A Task-oriented Chatbot Based on LSTM and Reinforcement Learning

Published: 25 November 2022

Abstract

Thanks to advances in deep learning, chatbots are widely used in messaging applications and offer a new way for humans and machines to interact. However, most chatbots act as simple question-answering systems that respond with formulaic answers. Traditional conversational chatbots usually adopt a retrieval-based model, which requires a large amount of conversational data to cover the various intents. Hence, it is desirable to train a chatbot that generates diverse dialogues from low-resource conversational data. We propose a method for building a task-oriented chatbot around a sentence generation model based on the generative adversarial network. Our architecture contains a generator that produces diverse sentences and a discriminator that judges them by comparing the generated sentences with the ground truth. In the generator, we combine an attention model with a sequence-to-sequence model built on hierarchical long short-term memory to extract sentence information. For the discriminator, our reward mechanism assigns low rewards to repeated sentences and high rewards to diverse ones. Extensive experiments demonstrate that our model generates more diverse and information-rich sentences than existing approaches.
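To make the reward mechanism concrete, below is a minimal sketch (not the authors' implementation) of how a diversity-based reward could drive a REINFORCE-style policy-gradient update: sentences whose n-grams repeat earlier output earn low rewards, novel sentences earn high rewards, and the reward weights the generator's log-likelihood of the sampled sentence. The function names, the bigram-novelty measure, and the constant baseline are all illustrative assumptions, not the paper's API.

```python
# A minimal sketch (not the paper's code) of the abstract's reward idea:
# repeated sentences earn low rewards, diverse sentences earn high rewards,
# and the reward weights a REINFORCE-style policy-gradient loss for the
# generator. The bigram-novelty reward and the constant baseline are
# simplifying assumptions.
import torch


def diversity_reward(tokens, history, n=2):
    """Fraction of the sentence's n-grams not seen in earlier generations."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    novel = sum(1 for g in ngrams if g not in history)
    history.update(ngrams)  # remember what has been generated so far
    return novel / len(ngrams)


def policy_gradient_loss(log_probs, reward, baseline=0.5):
    """REINFORCE: scale the sampled sentence's log-likelihood by its advantage.

    log_probs: per-token log-probabilities of the sampled sentence, shape (T,).
    A reward above the baseline reinforces the sentence; a reward below it
    pushes the generator away from repeating it.
    """
    return -(reward - baseline) * log_probs.sum()


if __name__ == "__main__":
    history = set()
    sentence = "i do not know".split()
    print(diversity_reward(sentence, history))  # 1.0: every bigram is novel
    print(diversity_reward(sentence, history))  # 0.0: every bigram repeats

    # Stand-in per-token log-probabilities from the generator's sampler.
    log_probs = torch.log(torch.tensor([0.4, 0.3, 0.5, 0.2]))
    # Reward 0.0 (a repeated sentence) gives a negative advantage, so
    # minimizing this loss lowers the sentence's probability.
    print(policy_gradient_loss(log_probs, reward=0.0).item())
```

In this sketch the generator is penalized exactly when its output overlaps what it has already produced, which mirrors the abstract's claim that the reward mechanism discourages repetition and promotes diversity.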



• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 1
  January 2023
  340 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3572718

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 25 November 2022
      • Online AM: 11 April 2022
      • Accepted: 30 March 2022
      • Revised: 7 February 2022
      • Received: 26 December 2020
Published in TALLIP, Volume 22, Issue 1


      Qualifiers

      • research-article
      • Refereed
