PatternRank+NN: A Ranking Framework Bringing User Behaviors into Entity Set Expansion from Web Search Queries

Published: 03 May 2020

Abstract

We propose PatternRank+NN, a ranking framework for entity set expansion from Web search queries, that is, for expanding a set of seed entities of a particular class. PatternRank+NN consists of two parts: PatternRank and NN. Unlike traditional methods, PatternRank brings user behaviors into entity set expansion from Web search queries. PatternRank is a Markov chain that simulates users' Web search query process on a graph model of the Web search query log and ranks the features of the class. The top-ranked features are used to generate candidate entities of the class. NN, a Nearest Neighbor ranking strategy, then ranks these candidate entities so that the seed set can be expanded with the top-ranked candidates. Our experiments demonstrate that PatternRank+NN outperforms state-of-the-art methods.
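
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract gives neither the transition probabilities of the Markov chain nor the distance used by NN, so the random walk with restart, the count-proportional transitions, the cosine similarity, the toy `QUERY_LOG`, and the function names `rank_patterns` and `rank_candidates` are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy log of (entity, pattern, count); "#" marks the entity slot.
QUERY_LOG = [
    ("paris", "hotels in #", 8), ("paris", "# weather", 5),
    ("london", "hotels in #", 7), ("london", "# weather", 4),
    ("berlin", "hotels in #", 6), ("berlin", "# weather", 3),
    ("python", "# tutorial", 9), ("python", "# weather", 1),
]


def rank_patterns(log, seeds, restart=0.15, iterations=50):
    """Score patterns with a random walk with restart (an assumed stand-in
    for PatternRank) over the entity-pattern bipartite graph."""
    edges = defaultdict(dict)
    for entity, pattern, count in log:
        edges[("e", entity)][("p", pattern)] = count
        edges[("p", pattern)][("e", entity)] = count
    # The walker starts at, and restarts from, the seed entities.
    start = {("e", s): 1.0 / len(seeds) for s in seeds}
    prob = dict(start)
    for _ in range(iterations):
        nxt = defaultdict(float)
        for node, mass in prob.items():
            total = sum(edges[node].values())
            for neighbor, weight in edges[node].items():
                # Move to a neighbor with probability proportional to the count.
                nxt[neighbor] += (1 - restart) * mass * weight / total
        for node, mass in start.items():
            nxt[node] += restart * mass
        prob = nxt
    scores = {name: p for (kind, name), p in prob.items() if kind == "p"}
    return sorted(scores, key=scores.get, reverse=True)


def rank_candidates(log, seeds, top_patterns):
    """Rank non-seed entities by cosine similarity to their nearest seed
    (an assumed reading of the Nearest Neighbor strategy)."""
    index = {p: i for i, p in enumerate(top_patterns)}
    vectors = defaultdict(lambda: [0.0] * len(top_patterns))
    for entity, pattern, count in log:
        if pattern in index:
            vectors[entity][index[pattern]] += count

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    seed_vectors = [vectors[s] for s in seeds if s in vectors]
    scored = [
        (max((cosine(vec, sv) for sv in seed_vectors), default=0.0), entity)
        for entity, vec in vectors.items() if entity not in seeds
    ]
    return [entity for _, entity in sorted(scored, reverse=True)]


seeds = {"paris", "london"}
patterns = rank_patterns(QUERY_LOG, seeds)
expanded = rank_candidates(QUERY_LOG, seeds, patterns[:2])
print(patterns, expanded)
```

In this toy setting, only entities that co-occur with one of the two top-ranked patterns become candidates, and each candidate is scored by its closest seed rather than by an average over all seeds.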

      • Published in

        ACM Transactions on the Web  Volume 14, Issue 3
        August 2020
        126 pages
        ISSN:1559-1131
        EISSN:1559-114X
        DOI:10.1145/3398019
        Copyright © 2020 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 3 May 2020
        • Accepted: 1 February 2020
        • Revised: 1 November 2019
        • Received: 1 November 2018
        Qualifiers

        • research-article
        • Research
        • Refereed
