Abstract
Unsupervised dependency parsing has become increasingly popular in recent years because it does not require expensive annotations, such as treebanks, which supervised and semi-supervised dependency parsing depend on. However, its accuracy is still far below that of supervised dependency parsers, partly because its parsing models are insufficient to capture the linguistic phenomena underlying texts. The performance of unsupervised dependency parsing can be improved by mining knowledge from texts and incorporating it into the model. In this article, syntactic knowledge is acquired from query logs to help estimate better probabilities in dependency models with valence. The proposed method is language independent and obtains an improvement of 4.1% in unlabeled accuracy on the Penn Chinese Treebank by utilizing additional dependency relations from the Sogou and Baidu query logs. Moreover, experiments show that the proposed model achieves an improvement of 8.07% on CoNLL 2007 English using the AOL query logs. We believe query logs are a useful source of syntactic knowledge for many natural language processing (NLP) tasks.
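The results above are reported as unlabeled accuracy, which is commonly computed as the fraction of tokens whose predicted head matches the gold head. The following is a minimal sketch of that metric, not the evaluation code used in the article. The function name `unlabeled_accuracy`, the head-index encoding (0 for the root), and the toy data are assumptions made for illustration, and the abstract does not say whether punctuation is excluded from scoring.

```python
from typing import Sequence


def unlabeled_accuracy(
    gold_heads: Sequence[Sequence[int]],
    pred_heads: Sequence[Sequence[int]],
) -> float:
    """Return the fraction of tokens whose predicted head equals the gold head.

    Each sentence is a sequence of head indices, one per token.
    By assumption, index 0 denotes the artificial root.
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted corpora must have the same number of sentences")

    correct = 0
    total = 0
    for gold, pred in zip(gold_heads, pred_heads):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences must have the same length")
        # Count tokens attached to the correct head, ignoring dependency labels.
        correct += sum(g == p for g, p in zip(gold, pred))
        total += len(gold)

    if total == 0:
        raise ValueError("no tokens to score")
    return correct / total


# Hypothetical toy corpus: two sentences, head index per token.
gold = [[2, 0, 2], [0, 1]]
pred = [[2, 0, 1], [0, 1]]
score = unlabeled_accuracy(gold, pred)
```

In this toy corpus four of the five tokens receive the correct head, so `score` is 0.8.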
Improving Unsupervised Dependency Parsing with Knowledge from Query Logs