Abstract
Some natural languages belong to the same family or share similar syntactic and/or semantic regularities. This property motivates researchers to share computational models across languages, using high-quality models to boost their low-performance counterparts. In this article, we follow a similar idea: we develop statistical and neural machine translation (MT) engines that are trained on one language pair but are used to translate another language. First, we train a reliable model for a high-resource language; then we exploit cross-lingual similarities to adapt the model to a closely related language with almost zero resources. We chose Turkish (Tr) and Azeri, or Azerbaijani (Az), as the language pair in our experiments. Azeri suffers from a lack of resources, as there is almost no bilingual corpus for this language. With our techniques, we are able to train an engine for the Az → English (En) direction that outperforms all other existing models.
Index Terms
Translating Low-Resource Languages by Vocabulary Adaptation from Close Counterparts