Abstract
Response generation models are the core component of open-domain dialogue systems. However, training such models requires large amounts of labeled data and pre-trained language generation models, both of which are often unavailable for low-resource languages. In this article, we propose a framework for training open-domain response generation models in low-resource settings, taking Dialectal Arabic (DA) as a working example. The framework first warm-starts a transformer-based encoder-decoder with pre-trained language model parameters. The resulting encoder-decoder is then adapted to DA through self-supervised pre-training on large-scale unlabeled data in the target dialect. Finally, the model is fine-tuned on a very small labeled dataset for open-domain response generation. The results show significant performance improvements on three spoken Arabic dialects after applying the framework’s three stages, reflected in higher BLEU and lower perplexity scores compared with multiple baseline models. In particular, our models generate fluent responses in multiple dialects, with an average human-evaluated fluency score above 4. Our data is made publicly available.
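The warm-starting stage described above can be sketched with the Hugging Face Transformers `EncoderDecoderModel` API, which initializes both the encoder and the decoder from a pre-trained BERT-style checkpoint and adds randomly initialized cross-attention weights. This is a minimal sketch, not the article's exact training code; the checkpoint name is illustrative (any Arabic BERT-style checkpoint could be substituted):

```python
# Sketch of warm-starting a transformer encoder-decoder from a
# pre-trained BERT-style checkpoint (stage 1 of the framework).
from transformers import AutoTokenizer, EncoderDecoderModel

# Illustrative checkpoint name (assumption, not necessarily the one
# used in the article); any BERT-style Arabic checkpoint would work.
checkpoint = "aubmindlab/bert-base-arabertv02"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Encoder and decoder weights are both copied from the checkpoint;
# the decoder's cross-attention layers are newly (randomly) initialized
# and are learned during the subsequent pre-training/fine-tuning stages.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)

# Generation-specific configuration the warm-started model still needs.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.pad_token_id = tokenizer.pad_token_id
```

The warm-started model would then go through the remaining two stages: self-supervised pre-training on unlabeled dialect text, followed by fine-tuning on the small labeled response generation dataset.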
Open-Domain Response Generation in Low-Resource Settings using Self-Supervised Pre-Training of Warm-Started Transformers