Improving Telugu Dependency Parsing using Combinatory Categorial Grammar Supertags

Published: 30 January 2015

Abstract

We show that Combinatory Categorial Grammar (CCG) supertags can improve Telugu dependency parsing. We first extract a CCG lexicon from the dependency treebank. Using both the CCG lexicon and the dependency treebank, we then create a CCG treebank with a chart parser. Exploring different morphological features of Telugu, we develop a supertagger using maximum entropy models. We provide the CCG supertags as features to the Telugu dependency parser (MST parser), obtaining an improvement of 1.8% in the unlabelled attachment score and 2.2% in the labelled attachment score. Our results show that CCG supertags improve the MST parser, especially on verbal arguments, for which it has weak rates of recovery.
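
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how unlabelled and labelled attachment scores are conventionally computed: the unlabelled score counts tokens whose predicted head is correct, and the labelled score additionally requires the dependency label to match. The function name `attachment_scores`, the toy sentence, and its labels are illustrative assumptions, not data or code from the paper, and the paper's exact evaluation settings (for example, its treatment of punctuation) are not specified here.

```python
from typing import List, Tuple

# Each token is (head_index, dependency_label).
# Head indices and labels below are invented placeholders for illustration.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must have the same length")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    # Unlabelled: only the head index has to match.
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # Labelled: both head index and dependency label have to match.
    both_ok = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_ok / n, 100.0 * both_ok / n


# Toy four-token sentence (hypothetical): one wrong label, one wrong head.
gold = [(2, "k1"), (0, "main"), (2, "k2"), (2, "k7")]
pred = [(2, "k1"), (0, "main"), (2, "k1"), (3, "k7")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.1f}%, LAS = {las:.1f}%")
```

In this toy case three of four heads are correct and two of four head-label pairs are correct, so the labelled score can never exceed the unlabelled one.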
