Abstract
Accurately understanding low-resource languages is central to task-oriented human-computer dialogue systems. Language understanding consists of two subtasks: intent detection and slot filling. Intent detection remains challenging because user input is often semantically ambiguous and its intent implicit. Moreover, modeling intent detection and slot filling separately significantly decreases the correctness and relevance between questions and answers. To address these issues, we propose a joint intent detection method that uses an asynchronous training strategy. The proposed method first encodes local text information extracted by a CNN together with the relationships among words emphasized by an attention structure. A joint intent detection model with an asynchronous training strategy is then built by either fusing the hidden states of the intent detection and slot filling layers or using the key information to fine-tune the whole network, which greatly increases the relevance between the intent detection and slot filling subtasks. Tested on an open-source airline travel dataset (ATIS) and a self-collected electricity service dataset (ECSF), the proposed method achieves accuracies of 97.49% and 89.68%, respectively, which demonstrates the effectiveness of joint learning and asynchronous training.
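The encoding and fusion steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the layer sizes, the mean pooling, the concatenation-based fusion, and all names (`conv1d_same`, `self_attention`, `joint_forward`, `init_params`) are assumptions chosen for illustration, and it covers only a forward pass, not the asynchronous training strategy.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def conv1d_same(emb, kernel):
    # 1-D convolution over the token axis with zero padding, followed by ReLU.
    # emb: (T, D) word embeddings; kernel: (K, D, H); returns (T, H).
    k = kernel.shape[0]
    left = k // 2
    padded = np.pad(emb, ((left, k - 1 - left), (0, 0)))
    windows = np.stack([padded[t:t + k] for t in range(emb.shape[0])])  # (T, K, D)
    return np.maximum(np.einsum("tkd,kdh->th", windows, kernel), 0.0)

def self_attention(h):
    # Scaled dot-product self-attention without learned projections (assumption).
    scores = h @ h.T / np.sqrt(h.shape[1])
    return softmax(scores, axis=-1) @ h

def joint_forward(emb, params):
    local = conv1d_same(emb, params["conv"])            # local text features, (T, H)
    context = self_attention(local)                     # word-to-word relations, (T, H)
    hidden = np.concatenate([local, context], axis=1)   # shared encoding, (T, 2H)

    # Intent branch: pool over tokens, then classify the whole utterance.
    intent_hidden = hidden.mean(axis=0)                 # (2H,)
    intent_probs = softmax(intent_hidden @ params["w_intent"])

    # Slot branch: fuse the intent hidden state into every token state.
    tiled = np.tile(intent_hidden, (hidden.shape[0], 1))
    fused = np.concatenate([hidden, tiled], axis=1)     # (T, 4H)
    slot_probs = softmax(fused @ params["w_slot"], axis=-1)
    return intent_probs, slot_probs

def init_params(rng, d=8, h=6, k=3, n_intents=4, n_slots=5):
    # Small random weights; all dimensions are hypothetical.
    return {
        "conv": rng.normal(scale=0.1, size=(k, d, h)),
        "w_intent": rng.normal(scale=0.1, size=(2 * h, n_intents)),
        "w_slot": rng.normal(scale=0.1, size=(4 * h, n_slots)),
    }
```

Calling `joint_forward` on a `(T, 8)` embedding matrix with the default parameters returns one intent distribution for the utterance and one slot distribution per token. Because the slot branch reads the intent hidden state, a joint loss over both outputs would couple the two subtasks, which is the general idea behind joint modeling.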
Index Terms
Joint Intent Detection Model for Task-oriented Human-Computer Dialogue System using Asynchronous Training