DOI: 10.1145/1807085.1807121
research-article

Understanding queries in a search database system

Published: 06 June 2010

ABSTRACT

It is well known that a search engine can significantly benefit from an auxiliary database, which can suggest interpretations of the search query by means of the involved concepts and their interrelationship. The difficulty is to translate abstract notions like concept and interpretation into a concrete search algorithm that operates over the auxiliary database. To surpass existing heuristics, there is a need for a formal basis, which is realized in this paper through the framework of a search database system, where an interpretation is identified as a parse. It is shown that the parses of a query can be generated in polynomial time in the combined size of the input and the output, even if parses are restricted to those having a nonempty evaluation. Identifying that one parse is more specific than another is important for ranking answers, and this framework captures the precise semantics of being more specific; moreover, performing this comparison between parses is tractable. Lastly, the paper studies the problem of finding the most specific parses. Unfortunately, this problem turns out to be intractable in the general case. However, under reasonable assumptions, the parses can be enumerated in an order of decreasing specificity, with polynomial delay and polynomial space.
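
As a rough illustration of what generating the parses of a query could look like, the sketch below uses a deliberately simplified toy model, not the formal framework of the paper. It assumes that the auxiliary database is a small dictionary from phrases to concept names, and that a parse is a split of the query into consecutive segments, each labeled either with a matching concept or as a plain keyword. The names `CONCEPTS`, `Segment`, and `parses` are hypothetical and exist only for this example.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical toy "auxiliary database": phrase -> concepts it may denote.
# This stands in for the real database; it is an assumption of the sketch.
CONCEPTS: Dict[Tuple[str, ...], Set[str]] = {
    ("new", "york"): {"city"},
    ("york",): {"city"},
    ("hotels",): {"category"},
}

# A segment is a phrase from the query together with its label.
Segment = Tuple[Tuple[str, ...], str]


def parses(tokens: List[str], start: int = 0) -> Iterator[List[Segment]]:
    """Yield every way to split tokens[start:] into labeled segments."""
    if start == len(tokens):
        yield []
        return
    for end in range(start + 1, len(tokens) + 1):
        phrase = tuple(tokens[start:end])
        # Concepts that the toy database associates with this phrase.
        labels = sorted(CONCEPTS.get(phrase, set()))
        if end == start + 1:
            # In this toy model a single token may always stay a plain keyword.
            labels.append("keyword")
        for label in labels:
            for rest in parses(tokens, end):
                yield [(phrase, label)] + rest


if __name__ == "__main__":
    for parse in parses("new york hotels".split()):
        print(parse)
```

Because every single token can fall back to the `keyword` label in this toy model, every suffix of the query has at least one parse, so the recursion never explores a branch that produces no output. That is the property that keeps the running time polynomial in the combined size of the input and the output here. Restricting parses to those with a nonempty evaluation, or ordering them by specificity, would need the machinery of the paper itself.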

Published in

PODS '10: Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2010
350 pages
ISBN: 9781450300339
DOI: 10.1145/1807085

Copyright © 2010 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 476 of 1,835 submissions, 26%
