research-article

An Improved English-to-Mizo Neural Machine Translation

Published: 26 May 2021

Abstract

Machine translation aims to bridge language barriers and reduce misinterpretation by translating between languages automatically. The quality of translations produced by corpus-based approaches depends predominantly on the availability of a large parallel corpus. Although machine translation for many Indian languages has steadily gained attention, research on machine translation for a low-resource language such as Mizo, and on the challenges of applying different machine translation techniques to it, remains very limited. In this article, we implement statistical approaches and modern neural approaches for the English–Mizo language pair and compare them. We experiment with different tokenization methods, architectures, and configurations. We evaluate the translations predicted by the trained models using both automatic and human evaluation measures. Furthermore, we analyze the models’ prediction errors, examine how prediction quality varies with sentence length, and compare model performance with existing baselines.
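As an illustration of what an analysis of prediction quality by sentence length might look like, the following is a minimal sketch and not the procedure used in the article. It groups sentence pairs into buckets by source length and averages a simple per-sentence score within each bucket. The score here is clipped unigram precision, a deliberately simple stand-in for the automatic measures mentioned above; the function names, the bucket edges (10, 20, 30), and whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def unigram_precision(hypothesis, reference):
    """Clipped unigram precision of a hypothesis against one reference.

    Illustrative stand-in metric only; it is not BLEU and not the
    article's evaluation measure.
    """
    hyp_tokens = hypothesis.split()  # assumption: whitespace tokenization
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(reference.split())
    hyp_counts = Counter(hyp_tokens)
    # Each hypothesis token is credited at most as often as it occurs
    # in the reference.
    matched = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    return matched / len(hyp_tokens)


def bucket_label(length, edges=(10, 20, 30)):
    """Map a token count to a bucket label such as '1-10' or '31+'.

    The edges are hypothetical and must be in ascending order.
    """
    lower = 1
    for edge in edges:
        if length <= edge:
            return f"{lower}-{edge}"
        lower = edge + 1
    return f"{lower}+"


def score_by_length(sources, hypotheses, references, edges=(10, 20, 30)):
    """Average the per-sentence score within each source-length bucket."""
    scores = defaultdict(list)
    for src, hyp, ref in zip(sources, hypotheses, references):
        label = bucket_label(len(src.split()), edges)
        scores[label].append(unigram_precision(hyp, ref))
    return {label: sum(vals) / len(vals) for label, vals in scores.items()}
```

Calling `score_by_length` with parallel lists of source sentences, model outputs, and reference translations returns one average score per length bucket, which makes it easy to see whether quality drops on longer inputs. In practice the per-sentence score would be replaced by whichever automatic metric is being reported.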


    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 4
      July 2021
      419 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3465463

      Copyright © 2021 Copyright held by the owner/author(s). Publication rights licensed to ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 26 May 2021
      • Accepted: 1 December 2020
      • Revised: 1 September 2020
      • Received: 1 November 2019
      Qualifiers

      • research-article
      • Refereed