ABSTRACT
It is well known that a search engine can significantly benefit from an auxiliary database, which can suggest interpretations of the search query in terms of the concepts involved and their interrelationships. The difficulty lies in translating abstract notions such as concept and interpretation into a concrete search algorithm that operates over the auxiliary database. To surpass existing heuristics, there is a need for a formal basis, which is realized in this paper through the framework of a search database system, where an interpretation is identified as a parse. It is shown that the parses of a query can be generated in polynomial time in the combined size of the input and the output, even if parses are restricted to those having a nonempty evaluation. Identifying that one parse is more specific than another is important for ranking answers, and this framework captures the precise semantics of being more specific; moreover, performing this comparison between parses is tractable. Lastly, the paper studies the problem of finding the most specific parses. Unfortunately, this problem turns out to be intractable in the general case. However, under reasonable assumptions, the parses can be enumerated in order of decreasing specificity, with polynomial delay and polynomial space.
Index Terms
Understanding queries in a search database system