Explicitly Modeling Word Translations in Neural Machine Translation

Published: 23 July 2019

Abstract

In this article, we show that word translations can be explicitly and effectively incorporated into NMT to avoid wrong translations. Specifically, we propose three cross-lingual encoders that explicitly incorporate word translations into NMT: (1) a Factored encoder, which encodes a word and its translation in a vertical (stacked) way; (2) a Gated encoder, which uses a gating mechanism to selectively control how much word-translation information is passed forward; and (3) a Mixed encoder, which jointly learns annotations over sequences in which words and their translations are alternately mixed. In addition, we first use a simple word-dictionary approach and then a word sense disambiguation (WSD) approach to model word context for better word translation. Experiments on Chinese-to-English translation demonstrate that all three encoders improve translation accuracy for both traditional RNN-based NMT and recent self-attention-based NMT (hereafter referred to as Transformer). Specifically, the Mixed encoder yields the largest improvement of 2.0 BLEU on RNN-based NMT, while the Gated encoder improves Transformer by 1.2 BLEU. These results indicate the usefulness of a WSD approach in modeling word context for better word translation, and the effectiveness of our proposed cross-lingual encoders in explicitly modeling word translations to avoid wrong translations in NMT. Finally, we discuss in depth, from several perspectives, how word translations benefit different NMT frameworks.
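To illustrate the flavor of the Gated encoder described above, the following is a minimal numpy sketch of a per-dimension gate that decides how much of a word's dictionary translation flows into the encoder input. All names (`W_g`, `U_g`, `b_g`, the embedding size `d`, and the random embeddings) are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d = 8  # illustrative embedding size

# Hypothetical embeddings for a source word and its dictionary translation.
x_word = rng.standard_normal(d)
x_trans = rng.standard_normal(d)

# Gate parameters (assumed names, randomly initialized for the sketch).
W_g = rng.standard_normal((d, d)) * 0.1
U_g = rng.standard_normal((d, d)) * 0.1
b_g = np.zeros(d)

# The gate selects, per dimension, how much translation information moves forward.
g = sigmoid(W_g @ x_word + U_g @ x_trans + b_g)

# Gated mixture of the word and its translation, fed to the encoder in place of
# the plain word embedding.
h = (1.0 - g) * x_word + g * x_trans
```

By contrast, the Factored encoder would combine `x_word` and `x_trans` by stacking/concatenation without a learned gate, and the Mixed encoder would interleave words and translations as alternating tokens in the input sequence.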

