Abstract
Spoken language understanding (SLU) has been addressed as a supervised learning problem, where a set of training data is available for each domain. However, annotating data for a new domain can be both financially costly and non-scalable. One existing approach addresses this problem with multi-domain learning, in which parameters are shared for joint training across domains; this shared parameterization is domain-agnostic and task-agnostic. In this article, we propose to improve the parameterization of this method by using domain-specific and task-specific model parameters for fine-grained knowledge representation and transfer. Experiments on five domains show that our model is more effective for multi-domain SLU and obtains the best results. In addition, we show its transferability when adapting to a new domain with little data, where it outperforms the prior best model by 12.4%. Finally, we explore a strong pre-trained model (RoBERTa) in our framework and find that the contributions of our framework do not fully overlap with those of contextualized word representations.
Index Terms
Multi-domain Spoken Language Understanding Using Domain- and Task-aware Parameterization