Adversarial Training for Unknown Word Problems in Neural Machine Translation

Published: 21 August 2019

Abstract

Nearly all work in neural machine translation (NMT) is limited to a quite restricted vocabulary, crudely treating all other words identically as an <unk> symbol. For languages with rich morphology, unknown (UNK) words also arise from the translation model's failure to understand morphological changes. In this study, we explore two ways to alleviate the UNK problem in NMT: a new generative adversarial network (with added value constraints and semantic enhancement) and a preprocessing technique that mixes in morphological noise. Training proceeds as a game among three adversarial sub-models (generator, filter, and discriminator). In this game, the filter focuses the discriminator's attention on negative generations that contain noise, which improves training efficiency. Eventually, the discriminator cannot easily distinguish the negative samples produced by the generator with the filter from human translations. Experimental results show that the proposed method significantly improves over several strong baseline models across various language pairs and achieves state-of-the-art results on the newly emerged Mongolian-Chinese task.
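The abstract's three-player setup (generator, filter, discriminator) can be illustrated with a minimal, purely hypothetical Python sketch. The function names and scoring rules below are illustrative stand-ins, not the authors' actual models: the "generator" randomly emits <unk> tokens, the "filter" scores how noisy a hypothesis is, and that score up-weights noisy negatives in the discriminator's loss.

```python
import random

random.seed(0)  # make the toy run reproducible

def generator(src):
    # Hypothetical generator: occasionally emits an <unk> token
    # in place of a source token.
    return [tok if random.random() > 0.3 else "<unk>" for tok in src]

def filter_score(hyp):
    # Hypothetical filter: fraction of noisy (<unk>) tokens in
    # the generated hypothesis.
    return sum(tok == "<unk>" for tok in hyp) / len(hyp)

def discriminator_weight(hyp):
    # The filter's score emphasizes noisy negative samples in the
    # discriminator's training loss (base weight 1.0, at most 2.0).
    return 1.0 + filter_score(hyp)

src = ["the", "cat", "sat", "on", "the", "mat"]
hyp = generator(src)
weight = discriminator_weight(hyp)
assert 1.0 <= weight <= 2.0
```

In the paper's actual GAN, all three players are neural networks trained jointly; this sketch only conveys the control flow in which the filter mediates between generator output and discriminator loss.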



Published in ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 1 (January 2020), 345 pages.

ISSN: 2375-4699 | EISSN: 2375-4702
DOI: 10.1145/3338846
Copyright © 2019 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 January 2019
• Revised: 1 May 2019
• Accepted: 1 June 2019
• Published: 21 August 2019
