Low-Resource Machine Transliteration Using Recurrent Neural Networks

Abstract
Grapheme-to-phoneme models are key components of automatic speech recognition and text-to-speech systems. They are particularly useful for low-resource language pairs that lack well-developed pronunciation lexicons. Such models are built on initial alignments between grapheme source and phoneme target sequences. Inspired by sequence-to-sequence recurrent neural network--based translation methods, the current research presents an approach that applies an alignment representation for input sequences, together with pretrained source and target embeddings, to address the transliteration problem for a low-resource language pair. Experiments on French and Vietnamese showed that, with only a small bilingual pronunciation dictionary available for training the transliteration models, promising results were obtained, with a large increase in BLEU score and reductions in Translation Error Rate (TER) and Phoneme Error Rate (PER). Moreover, we compared our proposed neural network--based transliteration approach with a statistical one.
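To make the PER metric mentioned in the abstract concrete, the sketch below computes it as the token-level Levenshtein distance between a reference and a hypothesized phoneme sequence, normalized by the reference length. This is a minimal illustration, not the paper's evaluation code, and the French phoneme sequences are hypothetical examples.

```python
def edit_distance(ref, hyp):
    """Classic dynamic-programming Levenshtein distance over token lists."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # cost of deleting all reference tokens
    for j in range(n + 1):
        d[0][j] = j  # cost of inserting all hypothesis tokens
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
    return d[m][n]


def phoneme_error_rate(ref, hyp):
    """PER = edit distance / number of reference phonemes."""
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: IPA-like phonemes for French "bonjour",
# with one substituted phoneme in the hypothesis.
ref = ["b", "ɔ̃", "ʒ", "u", "ʁ"]
hyp = ["b", "o", "ʒ", "u", "ʁ"]
print(phoneme_error_rate(ref, hyp))  # → 0.2
```

TER is computed analogously over target-side tokens (with shifts, in the full definition), which is why the two metrics tend to move together in the reported results.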
References

- Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, et al. 2016. Theano: A Python framework for fast computation of mathematical expressions. arXiv preprint arXiv:1605.02688 (2016).
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).
- Maria João Barros and Christian Weiss. 2006. Maximum entropy motivated grapheme-to-phoneme, stress and syllable boundary prediction for Portuguese text-to-speech. In IV Jornadas en Tecnologías del Habla, Zaragoza, Spain. 177--182.
- Maximilian Bisani and Hermann Ney. 2008. Joint-sequence models for grapheme-to-phoneme conversion. Speech Communication 50, 5 (2008), 434--451.
- Nam X. Cao, Nhut M. Pham, and Quan H. Vu. 2010. Comparative analysis of transliteration techniques based on statistical machine translation and joint-sequence model. In Proceedings of the 2010 Symposium on Information and Communication Technology. Association for Computing Machinery, 59--63.
- Stanley F. Chen et al. 2003. Conditional and joint models for grapheme-to-phoneme conversion. In Proceedings of Interspeech, Geneva. 2033--2036.
- Jonathan H. Clark, Chris Dyer, Alon Lavie, and Noah A. Smith. 2011. Better hypothesis testing for statistical machine translation: Controlling for optimizer instability. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers - Volume 2. Association for Computational Linguistics, 176--181.
- Sabine Deligne, Francois Yvon, and Frédéric Bimbot. 1995. Variable-length sequence matching for phonetic transcription using joint multigrams. In Proceedings of the 4th European Conference on Speech Communication and Technology. 2243--2246.
- Xiangyu Duan, Rafael E. Banchs, Min Zhang, Haizhou Li, and A. Kumaran. 2016. Report of NEWS 2016 machine transliteration shared task. ACL 2016 (2016), 58--72.
- Andrew Finch, Lemao Liu, Xiaolin Wang, and Eiichiro Sumita. 2016. Target-bidirectional neural models for machine transliteration. ACL 2016 (2016), 78--82.
- Andrew Finch and Eiichiro Sumita. 2010. Transliteration using a phrase-based statistical machine translation system to re-score the output of a joint multigram model. In Proceedings of the 2010 Named Entities Workshop. Association for Computational Linguistics, 48--52.
- Orhan Firat, Baskaran Sankaran, Yaser Al-Onaizan, Fatos T. Yarman Vural, and Kyunghyun Cho. 2016. Zero-resource translation with multi-lingual neural machine translation. arXiv preprint arXiv:1606.04164 (2016).
- Qin Gao and Stephan Vogel. 2008. Parallel implementations of word alignment tool. In Software Engineering, Testing, and Quality Assurance for Natural Language Processing. Association for Computational Linguistics, 49--57.
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (Nov. 1997), 1735--1780.
- Sebastien Jean, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio. 2014. On using very large target vocabulary for neural machine translation. arXiv preprint arXiv:1412.2007 (2014).
- Sittichai Jiampojamarn, Grzegorz Kondrak, and Tarek Sherif. 2007. Applying many-to-many alignments and hidden Markov models to letter-to-phoneme conversion. In HLT-NAACL, Vol. 7. 372--379.
- Sarvnaz Karimi, Falk Scholer, and Andrew Turpin. 2011. Machine transliteration survey. ACM Computing Surveys (CSUR) 43, 3 (2011), 17.
- Alexandre Klementiev and Dan Roth. 2006. Weakly supervised named entity transliteration and discovery from multilingual comparable corpora. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 817--824.
- Kevin Knight and Jonathan Graehl. 1998. Machine transliteration. Computational Linguistics 24, 4 (1998), 599--612.
- Philipp Koehn. 2017. Neural machine translation. arXiv preprint arXiv:1709.07809 (2017).
- Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, et al. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions. Association for Computational Linguistics, 177--180.
- A. Kumaran, Mitesh M. Khapra, and Haizhou Li. 2010. Report of NEWS 2010 transliteration mining shared task. In Proceedings of the 2010 Named Entities Workshop. Association for Computational Linguistics, 21--28.
- Antoine Laurent, Paul Deléglise, Sylvain Meignier, and France Spécinov-Trélazé. 2009. Grapheme to phoneme conversion using an SMT system. In Proceedings of Interspeech, ISCA. 708--711.
- Ngoc Tan Le and Fatiha Sadat. 2017. A neural network transliteration model in low resource settings. In Proceedings of the 16th International Conference of Machine Translation Summit, September 18-22, 2017, Nagoya, Japan, Volume 1: Research Track. 337--345.
- Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba. 2014. Addressing the rare word problem in neural machine translation. arXiv preprint arXiv:1410.8206 (2014).
- Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013. Linguistic regularities in continuous space word representations. In HLT-NAACL, Vol. 13. 746--751.
- Hoang Gia Ngo, Nancy F. Chen, Binh Minh Nguyen, Bin Ma, and Haizhou Li. 2015. Phonology-augmented statistical transliteration for low-resource languages. In Proceedings of Interspeech. 3670--3674.
- Garrett Nicolai, Bradley Hauer, Mohammad Salameh, Adam St Arnaud, Ying Xu, Lei Yao, and Grzegorz Kondrak. 2015. Multiple system combination for transliteration. In Proceedings of NEWS 2015, the 5th Named Entities Workshop. 72--79.
- Jong-Hoon Oh, Key-Sun Choi, and Hitoshi Isahara. 2006. A machine transliteration model based on correspondence between graphemes and phonemes. ACM Transactions on Asian Language Information Processing (TALIP) 5, 3 (2006), 185--208.
- Robert Östling and Jörg Tiedemann. 2017. Neural machine translation for low-resource languages. arXiv preprint arXiv:1708.05729 (2017).
- Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 311--318.
- Álvaro Peris. 2017. NMT-Keras. GitHub repository. https://github.com/lvapeab/nmt-keras.
- Hoang Phe. 1997. Vietnamese dictionary. Vietnam Lexicography Centre, Da Nang Publishing House (1997).
- Kanishka Rao, Fuchun Peng, Haşim Sak, and Françoise Beaufays. 2015. Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’15). IEEE, 4225--4229.
- Mihaela Rosca and Thomas Breuel. 2016. Sequence-to-sequence neural network models for transliteration. arXiv preprint arXiv:1610.09565 (2016).
- Hassan Sajjad, Helmut Schmid, Alexander Fraser, and Hinrich Schütze. 2017. Statistical models for unsupervised, semi-supervised and supervised transliteration mining. Computational Linguistics 43, 2 (2017), 349--375.
- Yan Shao and Joakim Nivre. 2016. Applying neural networks to English-Chinese named entity transliteration. In Proceedings of the 6th Named Entity Workshop, Joint with 54th ACL, Berlin. 73--77.
- Matthew G. Snover, Nitin Madnani, Bonnie Dorr, and Richard Schwartz. 2009. TER-Plus: Paraphrase, semantic, and alignment enhancements to translation edit rate. Machine Translation 23, 2--3 (2009), 117--127.
- Andreas Stolcke et al. 2002. SRILM: An extensible language modeling toolkit. In Proceedings of Interspeech, Vol. 2002.
- Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of the Advances in Neural Information Processing Systems. 3104--3112.
- Ye Kyaw Thu, Win Pa Pa, Yoshinori Sagisaka, and Naoto Iwahashi. 2016. Comparison of grapheme-to-phoneme conversion methods on a Myanmar pronunciation dictionary. In Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (2016). 11--22.
- Phuoc Tran, Dien Dinh, and Hien T. Nguyen. 2016. A character level based and word level based approach for Chinese-Vietnamese machine translation. Computational Intelligence and Neuroscience 2016, Article 9821608 (2016), 1--11.
- Raghavendra Udupa, K. Saravanan, A. Kumaran, and Jagadeesh Jagarlamudi. 2009. Mint: A method for effective and scalable mining of named entity transliterations from large comparable corpora. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 799--807.
- Sonjia Waxmonsky and Sravana Reddy. 2012. G2P conversion of proper names using word origin information. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 367--371.
- Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016. Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016).
- Kaisheng Yao and Geoffrey Zweig. 2015. Sequence-to-sequence neural net models for grapheme-to-phoneme conversion. arXiv preprint arXiv:1506.00196 (2015).
- Barret Zoph, Deniz Yuret, Jonathan May, and Kevin Knight. 2016. Transfer learning for low-resource neural machine translation. arXiv preprint arXiv:1604.02201 (2016).