Converting Dependency Structure Into Persian Phrase Structure

Published: 07 May 2019

Abstract

A treebank is an important and useful resource in natural language processing, and it is typically annotated under one of two schemas: phrase structure or dependency structure. Many works convert a phrase structure into a dependency structure and vice versa. Most of them exploit handcrafted head percolation tables and argument tables in predefined deterministic ways. In this article, we propose a method to convert a dependency structure into a phrase structure by enriching the trainable model of an earlier hybrid-strategy approach. Adding a classifier to the algorithm and applying postprocessing modifications increases the quality of the conversion. We evaluate our method on two languages, English and Persian, and then analyze the errors. The results of our experiments show a 46.01% reduction in error rate for English and 76.50% for Persian compared to our baseline. We build a new phrase structure treebank by converting 10,000 sentences of the Persian dependency treebank into their corresponding phrase structures and correcting them manually.
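
To make the conversion task concrete, the following is a minimal sketch of a naive deterministic projection from a dependency tree to a bracketed phrase structure. It is not the method proposed in the article: it uses no head percolation table, argument table, or classifier. The function name `dependency_to_phrase`, the `-1` root convention, the rule that names a phrase after its head's tag, and the toy sentence are all assumptions made for illustration, and the sketch assumes a projective dependency tree.

```python
from collections import defaultdict


def dependency_to_phrase(tokens, heads, tags):
    """Naively project a projective dependency tree into a bracketed phrase.

    tokens: list of word forms
    heads:  heads[i] is the index of token i's head, or -1 for the root
    tags:   part-of-speech tag per token
    """
    # Group dependents under their head and locate the root.
    children = defaultdict(list)
    root = None
    for i, h in enumerate(heads):
        if h == -1:
            root = i
        else:
            children[h].append(i)

    def build(i):
        leaf = f"({tags[i]} {tokens[i]})"
        if not children[i]:
            return leaf
        # Hypothetical labeling rule: every head with dependents projects
        # exactly one phrase, named after the head's own tag.
        positions = sorted(children[i] + [i])
        inner = " ".join(leaf if j == i else build(j) for j in positions)
        return f"({tags[i]}P {inner})"

    return build(root)


# Toy example: "the" depends on "cat", and "cat" depends on "sleeps".
print(dependency_to_phrase(["the", "cat", "sleeps"], [1, 2, -1], ["D", "N", "V"]))
# prints (VP (NP (D the) (N cat)) (V sleeps))
```

A fixed rule like this produces one flat phrase per head, which is the kind of decision a trainable component could refine, for example by choosing how many projections a head receives.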

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 18, Issue 3
      September 2019
      386 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3305347

      Copyright © 2019 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 7 May 2019
      • Revised: 1 January 2019
      • Accepted: 1 January 2019
      • Received: 1 December 2017

      Qualifiers

      • research-article
      • Research
      • Refereed