research-article

Cross-lingual Adaptation Using Universal Dependencies

Published: 26 May 2021

Abstract

We describe a cross-lingual adaptation method based on syntactic parse trees obtained from Universal Dependencies (UD), which are consistent across languages, to develop classifiers for low-resource languages. UD parsing aims to capture similarities as well as idiosyncrasies among typologically different languages. In this article, we show that models trained using UD parse trees for complex NLP tasks can characterize very different languages. We use two tasks, paraphrase identification and relation extraction, as case studies. Based on UD parse trees, we develop several models using tree kernels and show that these models, trained on an English dataset, can correctly classify data in other languages, e.g., French, Farsi, and Arabic. The proposed approach opens up avenues for exploiting UD parsing in similar cross-lingual tasks, which is especially useful for languages for which no labeled data is available.
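
To make the idea of a tree kernel over UD parse trees concrete, the following is a minimal sketch, not the models developed in the article. It assumes delexicalized trees whose nodes carry only UD relation labels, and it uses a simplified subset-tree-style kernel chosen here for illustration. The names `Node`, `delta`, and `tree_kernel`, the decay parameter `lam`, and the two toy trees are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Node:
    """A parse-tree node labeled with a UD relation (words are dropped)."""
    label: str
    children: Tuple["Node", ...] = ()


def nodes(tree: Node) -> Iterator[Node]:
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree.children:
        yield from nodes(child)


def delta(a: Node, b: Node, lam: float = 1.0) -> float:
    """Weighted count of common tree fragments rooted at nodes a and b."""
    if a.label != b.label:
        return 0.0
    # Fragments can only match if both nodes expand into the same labels.
    if [c.label for c in a.children] != [c.label for c in b.children]:
        return 0.0
    score = lam
    for ca, cb in zip(a.children, b.children):
        score *= 1.0 + delta(ca, cb, lam)
    return score


def tree_kernel(t1: Node, t2: Node, lam: float = 1.0) -> float:
    """Sum delta over all node pairs: a similarity between two trees."""
    return sum(delta(a, b, lam) for a, b in product(nodes(t1), nodes(t2)))


# Toy trees (assumed analyses): "She reads books" and "Elle lit des livres".
en = Node("root", (Node("nsubj"), Node("obj")))
fr = Node("root", (Node("nsubj"), Node("obj", (Node("det"),))))

print(tree_kernel(en, en))  # 6.0
print(tree_kernel(en, fr))  # 3.0
```

Because the labels come from the shared UD inventory rather than from words, the same kernel can compare an English tree with a French one, which is what lets a kernel-based classifier trained on one language score examples from another. A kernel of this form could, under these assumptions, be passed to any kernel method such as an SVM with a precomputed Gram matrix.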

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 20, Issue 4
      July 2021
      419 pages
      ISSN:2375-4699
      EISSN:2375-4702
      DOI:10.1145/3465463

      Copyright © 2021 Association for Computing Machinery.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 26 May 2021
      • Accepted: 1 January 2021
      • Revised: 1 December 2020
      • Received: 1 November 2019
      Qualifiers

      • research-article
      • Refereed
