Abstract
The Maximal-length Noun Phrase (MNP) frequently corresponds to a syntactic component, carries abundant syntactic and semantic information, and plays a specific semantic role in a sentence. Recognizing MNPs is an important task in Natural Language Processing and lays the foundation for analyzing and understanding sentence structure and semantics. By comparing the essential characteristics of different MNPs, this article defines the MNP in the Tibetan language from the perspective of the syntax tree. A total of 6,038 sentences are extracted from the syntax tree corpus, and the structure types, boundary features, and frequencies of their MNPs are analyzed. The MNPs are then recognized with two approaches: a sequence tagging model and a syntactic analysis model. With the sequence tagging model, the accuracy, recall, and F1 score of the recognition results are 87.14%, 84.72%, and 85.92%, respectively. With the syntactic analysis model, they are 87.66%, 87.63%, and 87.65%, respectively.
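As a quick sanity check on the figures above, the sketch below recomputes F1 as the harmonic mean of the first two numbers in each triple. It assumes that the value the abstract calls accuracy plays the role of precision; the function name `f1_score` and the `reported` dictionary are illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, in percent:
# (first metric, treated here as precision; recall; reported F1).
reported = {
    "sequence tagging": (87.14, 84.72, 85.92),
    "syntactic analysis": (87.66, 87.63, 87.65),
}

for model, (p, r, f1) in reported.items():
    # Print the recomputed value next to the reported one for comparison.
    print(f"{model}: computed {f1_score(p, r):.2f}, reported {f1:.2f}")
```

Because the inputs are already rounded to two decimals, a difference of about 0.01 in the last digit is expected and does not indicate an inconsistency.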
Index Terms
Recognition of Tibetan Maximal-length Noun Phrases Based on Syntax Tree