Research Article
Exploration of Effective Attention Strategies for Neural Automatic Post-editing with Transformer

Published: 12 August 2021
Abstract

Automatic post-editing (APE) is the task of correcting translation errors in the output of an unknown machine translation (MT) system; it has been considered a way to improve translation quality without modifying the underlying MT system. Recently, several Transformer variants that take both the MT output and its corresponding source sentence as input have been proposed for APE, and models that introduce an additional attention layer into the encoder to jointly encode the MT output with its source sentence ranked highly in the WMT19 APE shared task. We examine the effectiveness of this joint-encoding strategy in a controlled environment and compare four types of decoder multi-source attention strategies introduced in previous APE models. The experimental results indicate that the joint-encoding strategy is effective and that taking the final encoded representation of the source sentence is more appropriate than taking an intermediate representation from within the same encoder stack. Furthermore, among the multi-source attention strategies combined with joint encoding, the strategy that applies attention to the concatenated input representations and the strategy that sums the individual attentions to each input both improve the quality of APE results over joint encoding alone.
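The two best-performing decoder strategies mentioned in the abstract — attending over the concatenation of both encoded inputs versus attending to each input separately and summing the results — can be sketched in plain NumPy. This is a toy illustration of multi-source scaled dot-product attention, not the paper's implementation; all array names and sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Single-head scaled dot-product attention (Vaswani et al., 2017)."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ values

# Toy encoded representations (hypothetical sizes).
d = 8
rng = np.random.default_rng(0)
src = rng.normal(size=(5, d))  # encoded source sentence (5 tokens)
mt = rng.normal(size=(6, d))   # encoded MT output (6 tokens)
dec = rng.normal(size=(4, d))  # decoder hidden states used as queries

# Strategy 1: attend over the concatenation of both encoded inputs.
joint = np.concatenate([src, mt], axis=0)  # (11, d)
out_concat = attention(dec, joint, joint)  # (4, d)

# Strategy 2: attend to each encoded input separately and sum the results.
out_sum = attention(dec, src, src) + attention(dec, mt, mt)  # (4, d)
```

In a real Transformer decoder these would be multi-head sublayers with learned projections and residual connections; the sketch only shows how the two combination strategies differ in where the inputs are merged.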

References

  1. Jeffrey Allen and Christopher Hogan. 2000. Toward the development of a post editing module for raw machine translation output: A controlled language perspective. In 3rd International Controlled Language Applications Workshop (CLAW'00). 62–71.
  2. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations (ICLR'15), Yoshua Bengio and Yann LeCun (Eds.). http://arxiv.org/abs/1409.0473.
  3. Hanna Béchara, Yanjun Ma, and Josef van Genabith. 2011. Statistical post-editing for a statistical MT system. In MT Summit, Vol. 13. Asia-Pacific Association for Machine Translation, 308–315.
  4. Alexandre Bérard, Laurent Besacier, and Olivier Pietquin. 2017. LIG-CRIStAL submission for the WMT 2017 automatic post-editing task. In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics, 623–629. DOI: https://doi.org/10.18653/v1/W17-4772
  5. Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shujian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, and Marco Turchi. 2017. Findings of the 2017 conference on machine translation (WMT'17). In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics, 169–214. DOI: https://doi.org/10.18653/v1/W17-4717
  6. Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, and Marcos Zampieri. 2016. Findings of the 2016 conference on machine translation. In Proceedings of the 1st Conference on Machine Translation: Volume 2, Shared Task Papers. Association for Computational Linguistics, 131–198. DOI: https://doi.org/10.18653/v1/W16-2301
  7. Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, and Marco Turchi. 2015. Findings of the 2015 workshop on statistical machine translation. In Proceedings of the 10th Workshop on Statistical Machine Translation. Association for Computational Linguistics, 1–46. DOI: https://doi.org/10.18653/v1/W15-3001
  8. Rajen Chatterjee, M. Amin Farajian, Matteo Negri, Marco Turchi, Ankit Srivastava, and Santanu Pal. 2017. Multi-source neural automatic post-editing: FBK's participation in the WMT 2017 APE shared task. In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics, 630–638. DOI: https://doi.org/10.18653/v1/W17-4773
  9. Rajen Chatterjee, Christian Federmann, Matteo Negri, and Marco Turchi. 2019. Findings of the WMT 2019 shared task on automatic post-editing. In Proceedings of the 4th Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2). Association for Computational Linguistics, 11–28. DOI: https://doi.org/10.18653/v1/W19-5402
  10. Rajen Chatterjee, Matteo Negri, Raphael Rubino, and Marco Turchi. 2018. Findings of the WMT 2018 shared task on automatic post-editing. In Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, 710–725. DOI: https://doi.org/10.18653/v1/W18-6452
  11. Rajen Chatterjee, Marion Weller, Matteo Negri, and Marco Turchi. 2015. Exploring the planet of the apes: A comparative study of state-of-the-art methods for MT automatic post-editing. In Proceedings of the 53rd Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 156–161.
  12. Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP'14). Association for Computational Linguistics, 1724–1734. DOI: https://doi.org/10.3115/v1/D14-1179
  13. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4171–4186.
  14. Chris Hokamp. 2017. Ensembling factored neural machine translation models for automatic post-editing and quality estimation. In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics, 647–654. DOI: https://doi.org/10.18653/v1/W17-4775
  15. Marcin Junczys-Dowmunt and Roman Grundkiewicz. 2016. Log-linear combinations of monolingual and bilingual neural machine translation models for automatic post-editing. In Proceedings of the 1st Conference on Machine Translation: Volume 2, Shared Task Papers. Association for Computational Linguistics, 751–758. DOI: https://doi.org/10.18653/v1/W16-2378
  16. Marcin Junczys-Dowmunt and Roman Grundkiewicz. 2017. The AMU-UEdin submission to the WMT 2017 shared task on automatic post-editing. In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics, 639–646. DOI: https://doi.org/10.18653/v1/W17-4774
  17. Marcin Junczys-Dowmunt and Roman Grundkiewicz. 2018. MS-UEdin submission to the WMT2018 APE shared task: Dual-source transformer for automatic post-editing. In Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, 822–826. DOI: https://doi.org/10.18653/v1/W18-6467
  18. Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, and Alexander M. Rush. 2017. OpenNMT: Open-source toolkit for neural machine translation. In Proceedings of Meeting of the Association for Computational Linguistics: System Demonstrations. 67–72.
  19. Philipp Koehn and Kevin Knight. 2003. Feature-rich statistical translation of noun phrases. In Proceedings of the 41st Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 311–318. DOI: https://doi.org/10.3115/1075096.1075136
  20. Taku Kudo. 2018. Subword regularization: Improving neural network translation models with multiple subword candidates. In Proceedings of the 56th Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 66–75.
  21. WonKee Lee, Junsu Park, Byung-Hyun Go, and Jong-Hyeok Lee. 2019a. Transformer-based automatic post-editing with a context-aware encoding approach for multi-source inputs. arXiv preprint arXiv:1908.05679 (2019).
  22. WonKee Lee, Jaehun Shin, and Jong-Hyeok Lee. 2019b. Transformer-based automatic post-editing model with joint encoder and multi-source attention of decoder. In Proceedings of the 4th Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2). Association for Computational Linguistics, 112–117. DOI: https://doi.org/10.18653/v1/W19-5412
  23. Jindřich Libovický, Jindřich Helcl, Marek Tlustý, Ondřej Bojar, and Pavel Pecina. 2016. CUNI system for WMT16 automatic post-editing and multimodal translation tasks. In Proceedings of the 1st Conference on Machine Translation: Volume 2, Shared Task Papers. Association for Computational Linguistics, 646–654. DOI: https://doi.org/10.18653/v1/W16-2361
  24. António V. Lopes, M. Amin Farajian, Gonçalo M. Correia, Jonay Trénous, and André F. T. Martins. 2019. Unbabel's submission to the WMT2019 APE shared task: BERT-based encoder-decoder for automatic post-editing. In Proceedings of the 4th Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2). Association for Computational Linguistics, 118–123. DOI: https://doi.org/10.18653/v1/W19-5413
  25. Daniel Marcu and Daniel Wong. 2002. A phrase-based, joint probability model for statistical machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP'02). 133–139.
  26. Matteo Negri, Marco Turchi, Rajen Chatterjee, and Nicola Bertoldi. 2018. eSCAPE: A large-scale synthetic corpus for automatic post-editing. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC'18). European Language Resources Association (ELRA), 24–30.
  27. Santanu Pal, Nico Herbig, Antonio Krüger, and Josef van Genabith. 2018. A transformer-based multi-source automatic post-editing system. In Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, 827–835. DOI: https://doi.org/10.18653/v1/W18-6468
  28. Santanu Pal, Hongfei Xu, Nico Herbig, Antonio Krüger, and Josef van Genabith. 2019. USAAR – The transference architecture for English–German automatic post-editing. In Proceedings of the 4th Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2). Association for Computational Linguistics, 124–131. DOI: https://doi.org/10.18653/v1/W19-5414
  29. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 311–318.
  30. Mirko Plitt and François Masselot. 2010. A productivity test of statistical machine translation post-editing in a typical localisation context. Prague Bull. Math. Ling. 93 (2010), 7–16.
  31. Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Minimum risk training for neural machine translation. In Proceedings of the 54th Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1683–1692.
  32. Jaehun Shin and Jong-Hyeok Lee. 2018. Multi-encoder transformer network for automatic post-editing. In Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, 840–845. DOI: https://doi.org/10.18653/v1/W18-6470
  33. Michel Simard, Cyril Goutte, and Pierre Isabelle. 2007. Statistical phrase-based post-editing. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 508–515.
  34. Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A study of translation edit rate with targeted human annotation. In Proceedings of Association for Machine Translation in the Americas. 223–231.
  35. Amirhossein Tebbifakhr, Ruchit Agrawal, Matteo Negri, and Marco Turchi. 2018. Multi-source transformer with combined losses for automatic post editing. In Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, 846–852. DOI: https://doi.org/10.18653/v1/W18-6471
  36. Dušan Variš and Ondřej Bojar. 2017. CUNI system for WMT17 automatic post-editing task. In Proceedings of the 2nd Conference on Machine Translation. Association for Computational Linguistics, 661–666. DOI: https://doi.org/10.18653/v1/W17-4777
  37. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017), 5998–6008.
  38. Hongfei Xu, Qiuhui Liu, and Josef van Genabith. 2019. UdS submission for the WMT 19 automatic post-editing task. In Proceedings of the 4th Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2). Association for Computational Linguistics, 145–150. DOI: https://doi.org/10.18653/v1/W19-5417


Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 6
November 2021, 439 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3476127

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 April 2020
• Revised: 1 December 2020
• Accepted: 1 May 2021
• Published: 12 August 2021
        Qualifiers

        • research-article
        • Refereed
