Abstract
Cross-lingual dialogue systems are increasingly important in e-commerce and customer service as globalization accelerates. In real-world deployments, machine translation (MT) services are often placed before and after the dialogue system to bridge languages. However, noise and errors introduced during translation reduce the dialogue system's robustness and leave its performance far from satisfactory. In this article, we propose a novel MT-oriented noise-enhanced framework that exploits multi-granularity MT noise and injects it into the dialogue system to improve robustness. Specifically, we first design a method to automatically construct multi-granularity MT-oriented noise and multi-granularity adversarial examples, which contain abundant MT-oriented noise knowledge. We then propose two strategies to incorporate this noise knowledge: (i) utterance-level adversarial learning and (ii) a knowledge-level guided method. The former uses adversarial learning to train a perturbation-invariant encoder, guiding the dialogue system to learn noise-independent hidden representations. The latter explicitly incorporates the multi-granularity noise, which comprises the noisy tokens and their possible correct forms, into training and inference, thus improving the dialogue system's robustness. Experimental results on three dialogue models, two dialogue datasets, and two language pairs show that the proposed framework significantly improves the performance of cross-lingual dialogue systems.
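The two ideas of injecting multi-granularity noise and pairing noisy tokens with their possible correct forms can be illustrated with a small toy sketch. It is not the authors' implementation: the noise table, the function names (`inject_noise`, `candidate_corrections`), and the `rate` parameter are all hypothetical, and a real system would presumably derive the table from MT output rather than write it by hand.

```python
import random
import re
from typing import Dict, List, Tuple

# Hypothetical noise table: clean form -> noisy forms an MT system might emit.
# Multi-word keys stand in for phrase-level noise, single words for word-level.
NOISE_TABLE: Dict[str, List[str]] = {
    "book": ["reserve", "order"],
    "cheap": ["inexpensive"],
    "city centre": ["downtown", "town center"],
}


def _has_span(span: str, text: str) -> bool:
    """True if `span` occurs in `text` on word boundaries."""
    return re.search(r"\b" + re.escape(span) + r"\b", text) is not None


def inject_noise(utterance: str, table: Dict[str, List[str]],
                 rate: float, rng: random.Random) -> str:
    """Build a noisy (adversarial) variant by swapping clean spans for noisy ones."""
    noisy = utterance
    # Try longer keys first so phrase-level entries win over word-level ones.
    for clean in sorted(table, key=len, reverse=True):
        if _has_span(clean, noisy) and rng.random() < rate:
            replacement = rng.choice(table[clean])
            pattern = r"\b" + re.escape(clean) + r"\b"
            noisy = re.sub(pattern, lambda _m: replacement, noisy)
    return noisy


def candidate_corrections(utterance: str,
                          table: Dict[str, List[str]]) -> List[Tuple[str, List[str]]]:
    """List (noisy span, possible clean forms) pairs found in the utterance."""
    inverse: Dict[str, List[str]] = {}
    for clean, noisy_forms in table.items():
        for form in noisy_forms:
            inverse.setdefault(form, []).append(clean)
    return [(form, cleans) for form, cleans in inverse.items()
            if _has_span(form, utterance)]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = "book a cheap hotel in the city centre"
    print(inject_noise(clean, NOISE_TABLE, rate=1.0, rng=rng))
    print(candidate_corrections("i want to reserve a table downtown", NOISE_TABLE))
```

In this sketch, the output of `inject_noise` plays the role of an adversarial training example, while the pairs returned by `candidate_corrections` are the kind of side information a knowledge-level method could feed to the model at training and inference time.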
Index Terms
Robust Cross-lingual Task-oriented Dialogue