Abstract
We propose PatternRank+NN, a ranking framework for entity set expansion from Web search queries, that is, for expanding a set of seed entities of a particular class. The framework has two parts: PatternRank and NN. Unlike traditional methods, PatternRank brings user behaviors into entity set expansion from Web search queries. PatternRank is a Markov chain that simulates the Web search query process of users on a graph model of the Web search query log and uses it to rank the features of the class. The top-ranked features are used to generate candidate entities of the class. NN, a Nearest Neighbor ranking strategy, then ranks these candidates so that the seed set can be expanded with the top-ranked candidate entities. Our experiments show that PatternRank+NN outperforms the state-of-the-art methods.
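The two-stage pipeline described above can be illustrated with a small sketch. The abstract does not give the graph construction, the transition probabilities, or the similarity used by NN, so everything below is an assumption made for illustration: the query log is reduced to hypothetical (entity, pattern, count) triples, the Markov chain is a random walk with restart on the resulting bipartite graph, and the nearest-neighbor step uses cosine similarity to the closest seed. It is not the authors' implementation.

```python
import math
from collections import defaultdict

def rank_patterns(triples, seeds, restart=0.15, iters=50):
    """Rank query patterns with a random walk with restart (an assumed
    stand-in for the Markov chain; the real transition model may differ)."""
    # Bipartite graph: ("e", entity) nodes linked to ("p", pattern) nodes.
    weights = defaultdict(lambda: defaultdict(float))
    for entity, pattern, count in triples:
        weights[("e", entity)][("p", pattern)] += count
        weights[("p", pattern)][("e", entity)] += count
    nodes = list(weights)
    seed_nodes = {("e", s) for s in seeds if ("e", s) in weights}
    start = {n: (1.0 / len(seed_nodes) if n in seed_nodes else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(iters):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            total = sum(weights[n].values())
            for m, w in weights[n].items():
                nxt[m] += (1 - restart) * prob[n] * w / total
        prob = nxt
    scored = [(n[1], p) for n, p in prob.items() if n[0] == "p"]
    return sorted(scored, key=lambda x: (-x[1], x[0]))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_candidates(triples, seeds, top_patterns):
    """Generate candidates from the top patterns, then rank each one by
    its similarity to the nearest seed (assumed form of the NN step)."""
    vectors = defaultdict(lambda: defaultdict(float))
    for entity, pattern, count in triples:
        vectors[entity][pattern] += count
    keep = set(top_patterns)
    candidates = [e for e in vectors if e not in seeds and keep & set(vectors[e])]
    scored = [
        (e, max((cosine(vectors[e], vectors[s]) for s in seeds if s in vectors),
                default=0.0))
        for e in candidates
    ]
    return sorted(scored, key=lambda x: (-x[1], x[0]))

# Toy, made-up query log: "#" marks the slot where the entity appeared.
triples = [
    ("paris", "hotels in #", 5), ("paris", "# weather", 3),
    ("london", "hotels in #", 4), ("london", "# weather", 2),
    ("berlin", "hotels in #", 3), ("berlin", "# weather", 1),
    ("python", "# tutorial", 6), ("python", "# weather", 1),
]
seeds = ["paris", "london"]

patterns = rank_patterns(triples, seeds)
top_patterns = [p for p, _ in patterns[:2]]
ranked = rank_candidates(triples, seeds, top_patterns)
print(patterns)
print(ranked)
```

On this toy log, the walk concentrates probability on patterns shared by the seeds, and the nearest-neighbor step then favors the candidate whose pattern profile most resembles a seed. The restart probability and the number of top patterns kept are arbitrary choices here.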
PatternRank+NN: A Ranking Framework Bringing User Behaviors into Entity Set Expansion from Web Search Queries