research-article

A Unified Dialogue Management Strategy for Multi-intent Dialogue Conversations in Multiple Languages

Published: 20 September 2021

Abstract

Building virtual agents capable of handling complex user queries that span multiple intents of a domain is challenging, because the agent must manage several subtasks simultaneously. This article presents a universal Deep Reinforcement Learning framework for synthesizing dialogue managers that operate in a task-oriented dialogue system encompassing multiple intents of a domain. The conversation between agent and user is decomposed into hierarchies to segregate the subtasks pertaining to different intents. Hierarchical Reinforcement Learning, specifically the options framework, is used to learn policies at the different levels of the hierarchy that operate at distinct timescales to fulfill the user query successfully. The dialogue manager comprises a top-level intent meta-policy that selects among subtasks (options) and a low-level controller policy that picks primitive actions to communicate with the user and complete the subtask assigned by the top-level policy, across the varying intents of a domain. The proposed dialogue management module is trained such that it can be reused for any language for which it has been developed, with little to no additional supervision. The developed system is demonstrated for the "Air Travel" and "Restaurant" domains in the English and Hindi languages. Empirical results confirm the robustness and efficacy of the learned dialogue policy, which outperforms several baselines and a state-of-the-art system.
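The two-level control scheme described above can be sketched in a few lines. This is a minimal illustrative skeleton, not the authors' implementation: the option names, primitive dialogue acts, and the deterministic "policies" are placeholder assumptions standing in for the learned meta-policy and controller.

```python
# Hypothetical sketch of an options-style dialogue manager: a top-level
# meta-policy picks an intent subtask (option), then a low-level controller
# emits primitive dialogue acts until that subtask terminates and control
# returns to the top level. All names below are illustrative placeholders.

OPTIONS = ["book_flight", "reserve_restaurant"]
PRIMITIVE_ACTS = {
    "book_flight": ["request_origin", "request_destination", "confirm_flight"],
    "reserve_restaurant": ["request_cuisine", "request_time", "confirm_table"],
}

def meta_policy(pending_options):
    """Top-level policy: choose which intent subtask (option) to run next."""
    return pending_options[0]  # stand-in for a learned selection rule

def controller_policy(option, step):
    """Low-level policy: choose the next primitive act within an option."""
    return PRIMITIVE_ACTS[option][step]  # stand-in for a learned policy

def run_dialogue(user_intents):
    """Run options one at a time until every requested intent is satisfied."""
    transcript = []
    pending = [o for o in OPTIONS if o in user_intents]
    while pending:
        option = meta_policy(pending)
        for step in range(len(PRIMITIVE_ACTS[option])):
            transcript.append((option, controller_policy(option, step)))
        pending.remove(option)  # option terminated; control returns upward
    return transcript

transcript = run_dialogue({"book_flight", "reserve_restaurant"})
```

In the paper's setting both policies are learned with deep reinforcement learning and the option boundary is where the semi-Markov "temporal abstraction" happens; the sketch only shows the control flow, with each option running to termination before the meta-policy is consulted again.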



• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 6
  November 2021, 439 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3476127

      Copyright © 2021 Association for Computing Machinery.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 20 September 2021
      • Accepted: 1 April 2021
      • Revised: 1 September 2020
      • Received: 1 August 2020


      Qualifiers

      • research-article
      • Refereed
