short-paper

Chinese Syntax Parsing Based on Sliding Match of Semantic String

Published: 25 July 2019

Abstract

We present a novel Chinese parsing method that, unlike current syntax parsing based on deep learning, is based on Sliding Match of Semantic String (SMOSS). (1) Training stage: in a treebank, the headwords of tree nodes are represented by the semantic codes given in the Synonym Dictionary (Tongyici Cilin). N-gram semantic templates are extracted from every layer of each syntax tree with a sliding window to build an N-gram semantic template library. (2) Parsing stage: the words of a sentence, including the headwords of chunks, are represented by their semantic codes from Tongyici Cilin. A sliding window extracts N-gram semantic code strings, which are matched against the templates in the library; the mapping information of the matched templates then guides the chunking of the semantic code strings. Parsing is completed through repeated matching and chunking. At the same training scale, N-gram semantic templates create favorable conditions for flexible matching and improve parsing performance. With training and test sets drawn from the Tsinghua Chinese Treebank (TCT), the method achieves an F1-score of 99.71% in the closed test and 70.43% in the open test.
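
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say what the "mapping information" of a template contains, so the sketch assumes it records which span of the window forms a chunk, where the chunk's headword sits, and a chunk label. The names `windows`, `build_library`, and `parse`, the semantic codes `X1`, `X2`, `X3`, `X9`, the labels `np` and `dj`, the first-seen-wins rule for conflicting templates, and the longest-window-first matching order are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Template = Tuple[str, ...]
# Assumed mapping: (chunk start, chunk end (exclusive), head position, label),
# all positions relative to the start of the matched window.
Mapping = Tuple[int, int, int, str]


def windows(codes: Sequence[str], n: int) -> List[Tuple[int, Template]]:
    """Return (start index, n-gram) for every sliding window of length n."""
    return [(i, tuple(codes[i:i + n])) for i in range(len(codes) - n + 1)]


def build_library(layers, max_n: int = 3) -> Dict[Template, Mapping]:
    """Training stage (sketch): collect n-gram templates from tree layers.

    Each layer is (codes, (start, end, head, label)): the semantic codes of
    one tree layer plus one chunk that the treebank forms on that layer.
    """
    library: Dict[Template, Mapping] = {}
    for codes, (start, end, head, label) in layers:
        if end - start < 2:
            continue  # only multi-node chunks, so every match shrinks the input
        for n in range(2, max_n + 1):
            for i, gram in windows(codes, n):
                if i <= start and end <= i + n:  # window covers the chunk
                    # Assumption: on conflict, the first template seen wins.
                    library.setdefault(
                        gram, (start - i, end - i, head - i, label))
    return library


def parse(codes: Sequence[str], library: Dict[Template, Mapping],
          max_n: int = 3) -> list:
    """Parsing stage (sketch): match and chunk until nothing matches."""
    # Each node is (head semantic code, subtree); leaves are their own code.
    nodes = [(c, c) for c in codes]
    changed = True
    while changed and len(nodes) > 1:
        changed = False
        heads = [h for h, _ in nodes]
        # Assumption: try longer windows first, leftmost match first.
        for n in range(min(max_n, len(nodes)), 1, -1):
            for i, gram in windows(heads, n):
                mapping = library.get(gram)
                if mapping is None:
                    continue
                s, e, head, label = mapping
                chunk = nodes[i + s:i + e]
                # The new chunk is represented by its headword's code.
                merged = (nodes[i + head][0], (label, [t for _, t in chunk]))
                nodes[i + s:i + e] = [merged]
                changed = True
                break
            if changed:
                break
    return [tree for _, tree in nodes]


# Toy "treebank": two layers of one tree, with made-up codes and labels.
toy_layers = [
    (["X1", "X2", "X3"], (0, 2, 1, "np")),  # X1 X2 -> np, headed by X2
    (["X2", "X3"], (0, 2, 1, "dj")),        # X2 X3 -> dj, headed by X3
]
toy_library = build_library(toy_layers)
closed_result = parse(["X1", "X2", "X3"], toy_library)
open_result = parse(["X9", "X1", "X2"], toy_library)
```

In this toy setting, a sentence seen in training is reduced to a single tree, while a sentence containing the unseen code `X9` is only partially chunked. This is one simple way to picture why a closed test can score far higher than an open test, though the sketch makes no claim about how the reported figures were obtained.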


      • Published in

        ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 1
        January 2020
        345 pages
        ISSN: 2375-4699
        EISSN: 2375-4702
        DOI: 10.1145/3338846

        Copyright © 2019 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 25 July 2019
        • Revised: 1 April 2019
        • Accepted: 1 April 2019
        • Received: 1 December 2017

        Qualifiers

        • short-paper
        • Research
        • Refereed
