Abstract
Bilingual word embedding has been shown to be helpful for Statistical Machine Translation (SMT). However, most existing methods suffer from two obvious drawbacks. First, they only focus on simple contexts such as an entire document or a fixed-sized sliding window to build word embedding and ignore latent useful information from the selected context. Second, the word sense but not the word should be the minimal semantic unit; however, most existing methods still use word representation.
To overcome these drawbacks, this article presents a novel Graph-Based Bilingual Word Embedding (GBWE) method that projects bilingual word senses into a multidimensional semantic space. First, a bilingual word co-occurrence graph is constructed using the co-occurrence and pointwise mutual information between the words. Then, maximum complete subgraphs (cliques), which play the role of a minimal unit for bilingual sense representation, are dynamically extracted according to the contextual information. Consequently, correspondence analysis, principal component analyses, and neural networks are used to summarize the clique-word matrix into lower dimensions to build the embedding model.
Without contextual information, the proposed GBWE can be applied to lexical translation. In addition, given contextual information, GBWE is able to give a dynamic solution for bilingual word representations, which can be applied to phrase translation and generation. Empirical results show that GBWE can enhance the performance of lexical translation, as well as Chinese/French-to-English and Chinese-to-Japanese phrase-based SMT tasks (IWSLT, NTCIR, NIST, and WAT).
- Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2016. Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2289--229.Google Scholar
Cross Ref
- Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2017. Learning bilingual word embeddings with (almost) no bilingual data. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 451--462.Google Scholar
Cross Ref
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations. San Diego, USA.Google Scholar
- Marco Baroni, Georgiana Dinu, and Germán Kruszewski. 2014. Don’t count, predict! A systematic comparison of context-counting5 vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 238--247.Google Scholar
Cross Ref
- Jean-Paul Benzécri. 1973. L’Analyse des correspondances. In L’Analyse des Données, Vol. II. Dunod, Paris.Google Scholar
- Adam L. Berger, Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, John R. Gillett, John D. Lafferty, Robert L. Mercer, Harry Printz, and Luboš Ureš. 1994. The Candide system for machine translation. In Proceedings of the Workshop on Human Language Technology (HLT’94). 157--162. Google Scholar
Digital Library
- Arianna Bisazza, Nick Ruiz, Marcello Federico, and FBK-Fondazione Bruno Kessler. 2011. Fill-up versus interpolation methods for phrase-based SMT adaptation. In Proceedings of the International Workshop on Spoken Language Translation. 136--143.Google Scholar
- David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3 (2003), 993--1022. Google Scholar
Digital Library
- John Adrian Bondy and Uppaluri Siva Ramachandra Murty. 1976. Graph Theory with Applications. Vol. 290. Macmillan, London.Google Scholar
- Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics 19, 2 (June 1993), 263--311. Google Scholar
Digital Library
- Hailong Cao, Tiejun Zhao, Shu Zhang, and Yao Meng. 2016. A distribution-based model to learn bilingual word embeddings. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. 1818--1827.Google Scholar
- Daniel Cer, Michel Galley, Daniel Jurafsky, and Christopher D. Manning. 2010. Phrasal: A statistical machine translation toolkit for exploring new model features. In Proceedings of the NAACL HLT 2010 Demonstration Session. 9--12. Google Scholar
Digital Library
- Mauro Cettolo, Christian Girardi, and Marcello Federico. 2012. WIT: Web inventory of transcribed and translated talks. In Proceedings of the 16th Conference of the European Association for Machine Translation. 261--268.Google Scholar
- Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, and Tiejun Zhao. 2017. Syntax-directed attention for neural machine translation. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18).Google Scholar
- Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder--Decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 1724--1734.Google Scholar
Cross Ref
- Jonathan H. Clark, Chris Dyer, Alon Lavie, and Noah A. Smith. 2011. Better hypothesis testing for statistical machine translation: Controlling for optimizer instability. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 176--181. Google Scholar
Digital Library
- Jocelyn Coulmance, Jean-Marc Marty, Guillaume Wenzek, and Amine Benhalloum. 2015. Trans-gram, fast cross-lingual word-embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 1109--1113.Google Scholar
Cross Ref
- Jacob Devlin, Rabih Zbib, Zhongqiang Huang, Thomas Lamar, Richard Schwartz, and John Makhoul. 2014. Fast and robust neural network joint models for statistical machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1370--1380.Google Scholar
Cross Ref
- Allyson Ettinger, Philip Resnik, and Marine Carpuat. 2016. Retrofitting sense-specific word vectors using parallel text. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1378--1383.Google Scholar
Cross Ref
- Manaal Faruqui and Chris Dyer. 2014. Improving vector space word representations using multilingual correlation. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. 462--471.Google Scholar
Cross Ref
- Michel Galley, P. Chang, Daniel Cer, Jenny R. Finkel, and Christopher D. Manning. 2008. NIST open machine translation 2008 evaluation: Stanford University’s system description. In Unpublished Working Notes of the 2008 NIST Open Machine Translation Evaluation Workshop. Citeseer.Google Scholar
- Jianfeng Gao, Xiaodong He, Wen-tau Yih, and Li Deng. 2014. Learning continuous phrase representations for translation modeling. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 699--709.Google Scholar
Cross Ref
- Isao Goto, Bin Lu, Ka Po Chow, Eiichiro Sumita, and Benjamin K. Tsou. 2011. Overview of the patent machine translation task at the NTCIR-9 workshop. In Proceedings of NTCIR-9 Workshop Meeting. 559--578.Google Scholar
- Stephan Gouws, Yoshua Bengio, and Greg Corrado. 2015. Bilbowa: Fast bilingual distributed representations without word alignments. In Proceedings of the 32nd International Conference on Machine Learning. Google Scholar
Digital Library
- Stephan Gouws and Anders Søgaard. 2015. Simple task-specific bilingual word embeddings. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1386--1390.Google Scholar
Cross Ref
- Jiang Guo, Wanxiang Che, Haifeng Wang, and Ting Liu. 2014. Learning sense-specific word embeddings by exploiting bilingual resources. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 497--507.Google Scholar
- Karl Moritz Hermann and Phil Blunsom. 2013. Multilingual distributed representations without word alignment. arXiv Preprint arXiv:1312.6173 (2013).Google Scholar
- Karl Moritz Hermann and Phil Blunsom. 2014. Multilingual models for compositional distributed semantics. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Baltimore, Maryland, 58--68.Google Scholar
Cross Ref
- Hermann O. Hirschfeld. 1935. A connection between correlation and contingency. In Mathematical Proceedings of the Cambridge Philosophical Society, Vol. 31. Cambridge University Press, 520--524.Google Scholar
Cross Ref
- Eric Huang, Richard Socher, Christopher Manning, and Andrew Ng. 2012. Improving word representations via global context and multiple word prototypes. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 873--882. Google Scholar
Digital Library
- Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. 2015. SensEmbed: Learning sense embeddings for word and relational similarity. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 95--105.Google Scholar
Cross Ref
- Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. 2016. Embeddings for word sense disambiguation: An evaluation study. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin, Germany, 897--907.Google Scholar
Cross Ref
- Sujay Kumar Jauhar, Chris Dyer, and Eduard Hovy. 2015. Ontologically grounded multi-sense representation learning for semantic vector space models. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 683--693.Google Scholar
Cross Ref
- Sébastien Jean, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio. 2015. On using very large target vocabulary for neural machine translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 1--10.Google Scholar
Cross Ref
- Hyungsuk Ji and Sabine Ploux. 2003. Lexical knowledge representation with contexonyms. In Proceedings of the 9th Machine Translation Summit. Association for Machine Translation in the Americas, 194--201.Google Scholar
- Richard M. Karp. 1972. Reducibility among combinatorial problems. Complexity of Computer Computations (1972), 85--103.Google Scholar
- Alexandre Klementiev, Ivan Titov, and Binod Bhattarai. 2012. Inducing crosslingual distributed representations of words. Citeseer (2012).Google Scholar
- Philipp Koehn. 2004. Statistical significance tests for machine translation evaluation. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing. 388--395.Google Scholar
- Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions. 177--180. Google Scholar
Digital Library
- Philipp Koehn and Rebecca Knowles. 2017. Six challenges for neural machine translation. CoRR abs/1706.03872 (2017).Google Scholar
- Tomáš Kočiský, Karl Moritz Hermann, and Phil Blunsom. 2014. Learning bilingual word representations by marginalizing alignments. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 224--229.Google Scholar
Cross Ref
- Thomas K. Landauer and Susan T. Dutnais. 1997. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review (1997), 211--240.Google Scholar
- Stanislas Lauly, Hugo Larochelle, Mitesh Khapra, Balaraman Ravindran, Vikas C. Raykar, and Amrita Saha. 2014. An autoencoder approach to learning bilingual word representations. In Advances in Neural Information Processing Systems. 1853--1861. Google Scholar
Digital Library
- Rémi Lebret and Ronan Collobert. 2014. Word embeddings through Hellinger PCA. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. 482--490.Google Scholar
Cross Ref
- Omer Levy and Yoav Goldberg. 2014. Linguistic regularities in sparse and explicit word representations. In Proceedings of the 18th Conference on Computational Natural Language Learning. 171--180.Google Scholar
Cross Ref
- Shaoshi Ling, Yangqiu Song, and Dan Roth. 2016. Word embeddings with limited memory. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 387--392.Google Scholar
Cross Ref
- Ang Lu, Weiran Wang, Mohit Bansal, Kevin Gimpel, and Karen Livescu. 2015. Deep multilingual correlation for improved word embeddings. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 250--256.Google Scholar
Cross Ref
- R. Duncan Luce and Albert D. Perry. 1949. A method of matrix analysis of group structure. Psychometrika 14, 2 (1949), 95--116.Google Scholar
Cross Ref
- Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Bilingual word representations with monolingual quality in mind. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. 151--159.Google Scholar
- Haitao Mi, Zhiguo Wang, and Abe Ittycheriah. 2016. Vocabulary manipulation for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 124--129.Google Scholar
Cross Ref
- Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the Workshop at the 1st International Conference on Learning Representations (ICLR’13).Google Scholar
- Tomas Mikolov, Quoc V Le, and Ilya Sutskever. 2013. Exploiting similarities among languages for machine translation. arXiv Preprint arXiv:1309.4168 (2013).Google Scholar
- Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. Curran Associates, Inc., Stateline, Nevada, USA, 3111--3119. Google Scholar
Digital Library
- George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine J Miller. 1990. Introduction to wordnet: An on-line lexical database. International Journal of Lexicography 3, 4 (1990), 235--244.Google Scholar
Cross Ref
- Aditya Mogadala and Achim Rettinger. 2016. Bilingual word embeddings from parallel and non-parallel corpora for cross-language text classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 692--702.Google Scholar
Cross Ref
- Toshiaki Nakazawa, Shohei Higashiyama, Chenchen Ding, Hideya Mino, Isao Goto, Hideto Kazawa, Yusuke Oda, Graham Neubig, and Sadao Kurohashi. 2017. Overview of the 4th workshop on Asian translation. In Proceedings of the 4th Workshop on Asian Translation (WAT2017). Asian Federation of Natural Language Processing, Taipei, Taiwan, 1--54.Google Scholar
- Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics. 160--167. Google Scholar
Digital Library
- Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics 29, 1 (2003), 19--51. Google Scholar
Digital Library
- Takamasa Oshikiri, Kazuki Fukui, and Hidetoshi Shimodaira. 2016. Cross-lingual word representations via spectral graph embeddings. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 493--498.Google Scholar
Cross Ref
- Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1532--1543.Google Scholar
Cross Ref
- Karl Person. 1901. On lines and planes of closest fit to system of points in space. Philiosophical Magazine (1901).Google Scholar
- Yuval Pinter, Robert Guthrie, and Jacob Eisenstein. 2017. Mimicking word embeddings using subword RNNs. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 102--112.Google Scholar
Cross Ref
- Sabine Ploux and Hyungsuk Ji. 2003. A model for matching semantic maps between languages (French/English, English/French). Computational Linguistics 29, 2 (June 2003), 155--178. Google Scholar
Digital Library
- Sascha Rothe and Hinrich Schütze. 2016. Word embedding calculus in meaningful ultradense subspaces. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 512--517.Google Scholar
Cross Ref
- Avneesh Saluja, Hany Hassan, Kristina Toutanova, and Chris Quirk. 2014. Graph-based semi-supervised learning of translation models from monolingual data. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 676--686.Google Scholar
Cross Ref
- Joseph Sanu, Mingbin Xu, Hui Jiang, and Quan Liu. 2017. Word embeddings based on fixed-size ordinally forgetting encoding. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 310--315.Google Scholar
Cross Ref
- Tobias Schnabel, Igor Labutov, David Mimno, and Thorsten Joachims. 2015. Evaluation methods for unsupervised word embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 298--307.Google Scholar
Cross Ref
- Holger Schwenk. 2012. Continuous space translation models for phrase-based statistical machine translation. In Proceedings of 24th International Conference on Computational Linguistics: Posters. 1071--1080.Google Scholar
- Tianze Shi, Zhiyuan Liu, Yang Liu, and Maosong Sun. 2015. Learning cross-lingual word embeddings via matrix co-factorization. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 567--572.Google Scholar
Cross Ref
- Samuel L. Smith, David H. P. Turban, Steven Hamblin, and Nils Y. Hammerla. 2017. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. arXiv (2017).Google Scholar
- Andreas Stolcke. 2002. SRILM-an extensible language modeling toolkit. In Proceedings of the 7th International Conference on Spoken Language Processing. Denver, Colorado, USA, 257--286.Google Scholar
- Martin Sundermeyer, Tamer Alkhouli, Joern Wuebker, and Hermann Ney. 2014. Translation modeling with bidirectional recurrent neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 14--25.Google Scholar
Cross Ref
- Simon Šuster, Ivan Titov, and Gertjan van Noord. 2016. Bilingual learning of multi-sense embeddings with discrete autoencoders. arXiv (2016).Google Scholar
- Ilya Sutskever, Oriol Vinyals, and Quoc V. V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104--3112. Google Scholar
Digital Library
- Julien Tissier, Christopher Gravier, and Amaury Habrard. 2017. Dict2vec : Learning word embeddings using lexical dictionaries. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 254--263.Google Scholar
Cross Ref
- Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, and Hang Li. 2016. Modeling coverage for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 76--85.Google Scholar
Cross Ref
- Shyam Upadhyay, Manaal Faruqui, Chris Dyer, and Dan Roth. 2016. Cross-lingual models of word embeddings: An empirical comparison. arXiv (2016).Google Scholar
- Ivan Vulić and Anna Korhonen. 2016. On the role of seed lexicons in learning bilingual word embeddings. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 247--257.Google Scholar
Cross Ref
- Ivan Vulić and Marie-Francine Moens. 2015. Bilingual word embeddings from non-parallel document-aligned data applied to bilingual lexicon induction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Beijing, China, 719--725.Google Scholar
- Ekaterina Vylomova, Laura Rimell, Trevor Cohn, and Timothy Baldwin. 2015. Take and took, gaggle and goose, book and read: Evaluating the utility of vector differences for lexical relation learning. CoRR (2015).Google Scholar
- Rui Wang, Andrew Finch, Masao Utiyama, and Eiichiro Sumita. 2017. Sentence embedding for neural machine translation domain adaptation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Vancouver, Canada, 560--566.Google Scholar
Cross Ref
- Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, and Masao Utiyama. 2016. A bilingual graph-based semantic model for statistical machine translation. In Proceedings of the 25th International Joint Conference on Artificial Intelligence. 2950--2956. Google Scholar
Digital Library
- Chao Xing, Dong Wang, Chao Liu, and Yiye Lin. 2015. Normalized word embedding and orthogonal transform for bilingual word translation. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1006--1011.Google Scholar
Cross Ref
- Wei Yang, Wei Lu, and Vincent Zheng. 2017. A simple regularization-based algorithm for learning cross-domain word embeddings. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2888--2894.Google Scholar
Cross Ref
- Jinxing Yu, Xun Jian, Hao Xin, and Yangqiu Song. 2017. Joint embeddings of chinese words, characters, and fine-grained subcharacter components. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 286--291.Google Scholar
Cross Ref
- Biao Zhang, Deyi Xiong, Jinsong Su, Hong Duan, and Min Zhang. 2016. Variational neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 521--530.Google Scholar
Cross Ref
- Jiajun Zhang, Shujie Liu, Mu Li, Ming Zhou, and Chengqing Zong. 2014. Bilingually-constrained phrase embeddings for machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 111--121.Google Scholar
Cross Ref
- Meng Zhang, Yang Liu, Huanbo Luan, Yiqun Liu, and Maosong Sun. 2016. Inducing bilingual lexica from non-parallel data with earth mover’s distance regularization. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, Osaka, Japan, 3188--3198.Google Scholar
- Meng Zhang, Yang Liu, Huanbo Luan, and Maosong Sun. 2017. Adversarial training for unsupervised bilingual lexicon induction. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1959--1970.Google Scholar
Cross Ref
- Meng Zhang, Yang Liu, Huanbo Luan, and Maosong Sun. 2017. Earth mover’s distance minimization for unsupervised bilingual lexicon induction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 1924--1935.Google Scholar
Cross Ref
- Meng Zhang, Yang Liu, Huan-Bo Luan, Maosong Sun, Tatsuya Izuha, and Jie Hao. 2016. Building earth mover’s distance on bilingual word embeddings for machine translation. In Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2870--2876. Google Scholar
Digital Library
- Hao Zhou, Zhaopeng Tu, Shujian Huang, Xiaohua Liu, Hang Li, and Jiajun Chen. 2017. Chunk-based bi-scale decoder for neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Vancouver, Canada, 580--586.Google Scholar
Cross Ref
- Will Y. Zou, Richard Socher, Daniel Cer, and Christopher D. Manning. 2013. Bilingual word embeddings for phrase-based machine translation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 1393--1398.Google Scholar
Index Terms
Graph-Based Bilingual Word Embedding for Statistical Machine Translation
Recommendations
Word Sense Based Hindi-Tamil Statistical Machine Translation
Corpus based natural language processing has emerged with great success in recent years. It is not only used for languages like English, French, Spanish, and Hindi but also is widely used for languages like Tamil, Telugu etc. This paper focuses to ...
Bilingual LSA-based adaptation for statistical machine translation
We propose a novel approach to cross-lingual language model and translation lexicon adaptation for statistical machine translation (SMT) based on bilingual latent semantic analysis. Bilingual LSA enables latent topic distributions to be efficiently ...
Statistical machine translation of subtitles for highly inflected language pair
This paper addresses the problem of statistical machine translation between highly inflected languages. Even when dealing with closely-related language pairs, statistical machine translation encounters problems if the parallel corpus is not big enough. ...






Comments