research-article

Recognition of Tibetan Maximal-length Noun Phrases Based on Syntax Tree

Published: 30 March 2021

Abstract

The Maximal-length Noun Phrase (MNP) frequently corresponds to a syntactic component, carries rich syntactic and semantic information, and fills a specific semantic role in a sentence. Recognizing MNPs is therefore an important task in Natural Language Processing and lays the foundation for analyzing and understanding sentence structure and semantics. By comparing the essential properties of different kinds of MNPs, this article defines the MNP in the Tibetan language from the perspective of the syntax tree. A total of 6,038 sentences are extracted from the syntax tree corpus; the structure types, boundary features, and frequencies of their MNPs are analyzed; and the MNPs are recognized with a sequence tagging model and with a syntactic analysis model. The sequence tagging model achieves an accuracy, recall, and F1 score of 87.14%, 84.72%, and 85.92%, respectively. The syntactic analysis model achieves an accuracy, recall, and F1 score of 87.66%, 87.63%, and 87.65%, respectively.
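
The reported scores can be sanity-checked with a few lines of arithmetic. The sketch below assumes that the figure the abstract labels "accuracy" plays the role of precision in the standard F1 formula (the harmonic mean of precision and recall); the abstract does not state this, and the function name `f1_score` is purely illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract; "accuracy" is treated as precision (an assumption).
reported = {
    "sequence tagging": (87.14, 84.72, 85.92),
    "syntactic analysis": (87.66, 87.63, 87.65),
}

for model, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    # Show the recomputed value next to the reported one for comparison.
    print(f"{model}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported:.2f}")
```

Under that assumption, both recomputed values land within about 0.01 percentage points of the reported F1 scores, a gap that would be consistent with the inputs having been rounded to two decimals.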

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 2
      March 2021
      313 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3454116

      Copyright © 2021 Association for Computing Machinery.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 30 March 2021
      • Revised: 1 September 2020
      • Accepted: 1 September 2020
      • Received: 1 January 2020

      Qualifiers

      • research-article
      • Refereed