Abstract
How to effectively model global context has been a critical challenge for document-level neural machine translation (NMT). Both preceding and global context have been carefully explored in the sequence-to-sequence (seq2seq) framework. However, previous studies generally map the global context into a single vector, which is insufficient to represent the entire document because it largely ignores the hierarchy between the sentences and the words within them. In this article, we propose to model the source-side global context at both the sentence level and the word level. Specifically, at the sentence level we extract global context that is useful for the current sentence, while at the word level we compute global context against the words within the current sentence. On this basis, the two kinds of global context can be appropriately fused before being incorporated into the state-of-the-art seq2seq model, i.e., the Transformer. Detailed experiments on various document-level translation tasks show that global context at both the sentence level and the word level significantly improves translation performance. More encouragingly, the two kinds of global context are complementary, leading to further improvement when both are used together.
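The abstract's two-level scheme can be illustrated with a toy sketch: a sentence-level query attends over document sentence embeddings, a word-level query attends over document words, and the two resulting context vectors are fused by a gate. This is a minimal, pure-Python illustration of the general idea (scaled dot-product attention plus gated fusion), not the authors' exact architecture; all vector values, the `gate_bias` parameter, and the toy gate formula are invented for demonstration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Scaled dot-product attention: one query vector over key/value vectors."""
    d = len(query)
    weights = softmax([dot(query, k) / math.sqrt(d) for k in keys])
    # Weighted sum of value vectors -> a single context vector.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]

def gated_fusion(sent_ctx, word_ctx, gate_bias=0.0):
    """Element-wise sigmoid-gated fusion of the two context vectors (toy gate)."""
    fused = []
    for s, w in zip(sent_ctx, word_ctx):
        g = 1.0 / (1.0 + math.exp(-(s - w + gate_bias)))
        fused.append(g * s + (1.0 - g) * w)
    return fused

# Sentence level: the current sentence's embedding attends over the
# embeddings of the other sentences in the document (toy 4-dim vectors).
cur_sent = [0.2, 0.1, 0.4, 0.3]
doc_sents = [[0.1, 0.0, 0.5, 0.2], [0.3, 0.2, 0.1, 0.4]]
sent_ctx = attend(cur_sent, doc_sents, doc_sents)

# Word level: a word in the current sentence attends over document words.
cur_word = [0.5, 0.1, 0.2, 0.2]
doc_words = [[0.4, 0.1, 0.3, 0.1], [0.2, 0.3, 0.2, 0.3], [0.6, 0.0, 0.1, 0.2]]
word_ctx = attend(cur_word, doc_words, doc_words)

# Fuse the two kinds of global context before feeding them to the decoder.
fused = gated_fusion(sent_ctx, word_ctx)
```

In the paper's setting, the fused context would be injected into the Transformer's encoder or decoder layers; here the gate simply interpolates the two context vectors element-wise, showing why the two levels can be complementary rather than redundant.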
One Type Context Is Not Enough: Global Context-aware Neural Machine Translation