research-article

Graph-Based Bilingual Word Embedding for Statistical Machine Translation

Published: 25 July 2018

Abstract

Bilingual word embedding has been shown to be helpful for Statistical Machine Translation (SMT). However, most existing methods suffer from two obvious drawbacks. First, they build word embeddings only from simple contexts, such as an entire document or a fixed-size sliding window, and ignore potentially useful latent information in the selected context. Second, the word sense, rather than the word, should be the minimal semantic unit, yet most existing methods still rely on word-level representations.

To overcome these drawbacks, this article presents a novel Graph-Based Bilingual Word Embedding (GBWE) method that projects bilingual word senses into a multidimensional semantic space. First, a bilingual word co-occurrence graph is constructed using the co-occurrence and pointwise mutual information between the words. Then, maximum complete subgraphs (cliques), which play the role of a minimal unit for bilingual sense representation, are dynamically extracted according to the contextual information. Finally, correspondence analysis, principal component analysis, and neural networks are used to reduce the clique-word matrix to lower dimensions and build the embedding model.
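The first steps of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: co-occurrence is counted at the level of an aligned sentence pair, an edge is kept when the pointwise mutual information exceeds a hypothetical threshold `min_pmi`, and cliques are enumerated once with the standard Bron-Kerbosch algorithm rather than extracted dynamically from context. The function names and the toy English-French corpus are invented for illustration, and the final dimensionality-reduction step is left out.

```python
import math
from collections import Counter
from itertools import combinations


def build_pmi_graph(sentence_pairs, min_pmi=0.0):
    """Return {word: set(neighbours)} with an edge where PMI > min_pmi.

    Assumption: two words co-occur when they appear in the same aligned
    sentence pair. Tokens carry a language prefix so that source and
    target words stay distinct.
    """
    word_count = Counter()
    pair_count = Counter()
    for src, tgt in sentence_pairs:
        tokens = {"src:" + w for w in src.split()} | {"tgt:" + w for w in tgt.split()}
        word_count.update(tokens)
        pair_count.update(frozenset(p) for p in combinations(sorted(tokens), 2))

    n = len(sentence_pairs)
    graph = {w: set() for w in word_count}
    for pair, c_xy in pair_count.items():
        x, y = tuple(pair)
        # PMI(x, y) = log( p(x, y) / (p(x) * p(y)) ), with counts over n pairs
        pmi = math.log((c_xy * n) / (word_count[x] * word_count[y]))
        if pmi > min_pmi:
            graph[x].add(y)
            graph[y].add(x)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def clique_word_matrix(cliques, vocab):
    """Binary matrix: one row per clique, one column per word in vocab."""
    return [[1 if w in c else 0 for w in vocab] for c in cliques]


# Toy parallel corpus (invented): "bank" is ambiguous between two senses.
pairs = [
    ("bank money", "banque argent"),
    ("bank river", "rive fleuve"),
    ("money", "argent"),
    ("river", "fleuve"),
]

graph = build_pmi_graph(pairs)
cliques = maximal_cliques(graph)
vocab = sorted(graph)
matrix = clique_word_matrix(cliques, vocab)

for clique in sorted(cliques, key=sorted):
    print(sorted(clique))
```

In this toy corpus the ambiguous word `src:bank` lands in two different cliques, one with `tgt:banque` and one with `tgt:rive`, which is the intuition behind using cliques instead of words as the unit of sense. The rows of `matrix` are what a reduction method such as principal component analysis would then compress.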

Without contextual information, the proposed GBWE can be applied to lexical translation. In addition, given contextual information, GBWE is able to give a dynamic solution for bilingual word representations, which can be applied to phrase translation and generation. Empirical results show that GBWE can enhance the performance of lexical translation, as well as Chinese/French-to-English and Chinese-to-Japanese phrase-based SMT tasks (IWSLT, NTCIR, NIST, and WAT).

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 17, Issue 4
      December 2018
      193 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3229525

      Copyright © 2018 Owner/Author

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 25 July 2018
      • Accepted: 1 March 2018
      • Revised: 1 January 2018
      • Received: 1 October 2017
      Qualifiers

      • research-article
      • Research
      • Refereed
