Abstract
Dependency parsing is an important task in Natural Language Processing (NLP), but a mature parser requires a large treebank for training, which remains extremely costly to create. Tibetan is an extremely low-resource language for NLP: no Tibetan dependency treebank is publicly available, and such resources are currently produced only by manual annotation. Moreover, there is little prior research on treebank construction for Tibetan. We propose a novel multi-level chunk-based syntactic parsing method to perform constituent-to-dependency treebank conversion for Tibetan under these scarce conditions. Our method mines more dependencies from Tibetan sentences, builds a high-quality Tibetan dependency tree corpus, and makes fuller use of the inherent regularities of the language itself. We train dependency parsing models on the treebank obtained by this preliminary conversion. The best model achieves 86.5% accuracy, 96% LAS, and 97.85% UAS, exceeding the best results of existing conversion methods. The experimental results show that our method is well suited to low-resource settings: it both addresses the scarcity of Tibetan dependency treebanks and avoids needless manual annotation. The method embodies the regularity of strongly knowledge-guided linguistic analysis, which is of great significance for advancing research on Tibetan information processing.
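For readers unfamiliar with the evaluation metrics cited above, LAS and UAS are standard attachment scores for dependency parsing. The following minimal sketch shows how they are computed; the tuple representation of gold and predicted arcs is an illustrative assumption, not the authors' evaluation code:

```python
def uas_las(gold, pred):
    """Compute attachment scores over one sentence (or a concatenated corpus).

    gold, pred: lists of (head_index, dep_label) pairs, one per token.
    UAS: fraction of tokens whose predicted head is correct.
    LAS: fraction of tokens whose head AND dependency label are both correct.
    """
    assert len(gold) == len(pred), "token counts must match"
    n = len(gold)
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return uas_hits / n, las_hits / n

# Toy 3-token sentence: token 3 gets the right head but the wrong label.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "obl")]
uas, las = uas_las(gold, pred)
# uas == 1.0, las == 2/3
```

Because LAS requires both the head and the label to be correct, it is always less than or equal to UAS, consistent with the 96% LAS versus 97.85% UAS reported above.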
Multi-level Chunk-based Constituent-to-Dependency Treebank Transformation for Tibetan Dependency Parsing