
Metadial: A Meta-learning Approach for Arabic Dialogue Generation

Published: 16 June 2023

Abstract

Dialogue generation is the automatic generation of a text response given a user’s input. Dialogue generation for low-resource languages has long been a challenging task for researchers. Advances in deep learning models have made conversational agents that perform dialogue generation not only possible but also effective and useful in applications spanning a variety of domains. Nevertheless, work on conversational bots for low-resource languages such as Arabic remains limited due to various challenges, including the language’s structure, its vocabulary, and the scarcity of its data resources. Meta-learning has been introduced to natural language processing (NLP) before and has shown significant improvements on many tasks; however, it has rarely been used for natural language generation (NLG) tasks and never for Arabic NLG. In this work, we propose a meta-learning approach to Arabic dialogue generation that enables fast adaptation to a low-resource language, namely Arabic. We start from existing pre-trained models, meta-learn the initial parameters on high-resource datasets, and then fine-tune those parameters on the target tasks. We show that the proposed model, which employs meta-learning techniques, improves generalization and enables fast adaptation of the transformer model on low-resource NLG tasks. We report gains in BLEU-4 and improvements in Semantic Textual Similarity (STS) compared with the existing state-of-the-art approach. We also further study the effect of the meta-learning algorithms on the models’ response generation.
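The training recipe the abstract describes (meta-learn an initialization across high-resource tasks, then fine-tune on the low-resource target) can be illustrated with a first-order, Reptile-style meta-update. This is a minimal sketch on a toy one-parameter model, not the paper’s implementation; the function names, learning rates, and the scalar model `y = theta * x` are all illustrative assumptions.

```python
# Hypothetical sketch of a Reptile-style meta-learning loop:
# repeatedly fine-tune a copy of the initialization on a sampled
# high-resource task, then move the initialization toward the
# fine-tuned weights. A transformer would replace the toy model.
import random

def inner_finetune(theta, task, steps=5, lr=0.1):
    """A few SGD steps on one task; the toy model is y = theta * x."""
    for _ in range(steps):
        x, y = random.choice(task)
        pred = theta * x
        grad = 2 * (pred - y) * x  # d/d_theta of the squared error
        theta -= lr * grad
    return theta

def reptile(tasks, meta_steps=200, meta_lr=0.5):
    """Meta-learn an initialization that adapts fast to any task."""
    theta = 0.0
    for _ in range(meta_steps):
        task = random.choice(tasks)
        adapted = inner_finetune(theta, task)
        # First-order outer update: step toward the adapted weights.
        theta += meta_lr * (adapted - theta)
    return theta
```

With tasks whose true slopes cluster around 2.0, the learned initialization ends up near 2.0, so a handful of fine-tuning steps suffices on any new task from the family; this mirrors how the meta-learned transformer parameters are meant to adapt quickly to the low-resource Arabic dialogue data.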


      • Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 6
June 2023
635 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3604597


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 16 June 2023
        • Online AM: 10 April 2023
        • Accepted: 13 March 2023
        • Received: 26 November 2022
