research-article

Exploiting and Maintaining Materialized Views for XML Keyword Queries

Published: 01 December 2012

Abstract

Keyword querying is a user-friendly mechanism for retrieving useful information from XML data in Web and scientific applications. Inspired by the performance benefits of exploiting materialized views when processing structured queries, we investigate the feasibility of answering XML keyword queries using materialized views and present a general framework for doing so. We then develop an XML keyword search engine that leverages materialized views for query evaluation and maintains them incrementally when the XML data is updated. Experimental evaluation demonstrates the significance and efficiency of our approach.



      Reviews

      Scott Arthur Moody

      The concept of Extensible Markup Language (XML) keyword searching is an interesting subset of query processing, as XML (along with its variants) has become the main web data representation format. This paper analyzes "how to efficiently evaluate and optimize XML keyword queries" through the semantic caching of materialized views. Traditional keyword searching deals with the metadata of an XML (or web) document. The authors not only extend the search across the entire document, but also apply optimizations to manage the huge amount of Internet data. Materialized views are queries whose results have been cached, so they are immediately available to speed up query processing. The twist is that a cached view can be reused to answer queries other than an exact duplicate of the query that produced it. The paper describes an innovative approach using a search engine devised by the authors, which "can answer queries using materialized views and maintain [those views] incrementally upon XML data update[s]." In traditional XML keyword search, a query is converted into an XQuery, but this approach "incurs excessively high time complexity" and requires existing data schemas. The authors describe a general framework in which the relevant materialized views can be created and continually updated with respect to dynamic source data. Experimental results show significant performance improvements from answering queries using materialized views, together with efficient incremental maintenance of those views. In all, this paper is very detailed, with theorems for these new materialized view approaches and algorithms for implementing the authors' approach. Combined with the experimental results, this could be an important addition to the XML keyword query research field.

      Online Computing Reviews Service
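The idea of reusing a cached view for a query that is not an exact duplicate can be illustrated with a small sketch. It is not the authors' engine or algorithm: it assumes SLCA (smallest lowest common ancestor) result semantics, Dewey labels represented as tuples, and a brute-force evaluation, none of which is stated on this page. The names `slca`, `answer_with_views`, `INDEX`, and `VIEWS` are hypothetical, and incremental view maintenance is not shown.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# Assumption: each XML node is identified by a Dewey label, e.g. (1, 2, 1),
# so that a node's ancestors are exactly the proper prefixes of its label.
Dewey = Tuple[int, ...]


def ancestors_or_self(label: Dewey) -> Set[Dewey]:
    """All non-empty prefixes of a Dewey label, including the label itself."""
    return {label[:i] for i in range(1, len(label) + 1)}


def slca(match_lists: Iterable[Iterable[Dewey]]) -> Set[Dewey]:
    """Brute-force SLCA: nodes whose subtree holds a match from every list
    and that have no proper descendant with the same property."""
    covering = None
    for matches in match_lists:
        anc: Set[Dewey] = set()
        for m in matches:
            anc |= ancestors_or_self(m)
        covering = anc if covering is None else covering & anc
    covering = covering or set()
    return {u for u in covering
            if not any(v != u and v[:len(u)] == u for v in covering)}


def answer_with_views(query: FrozenSet[str],
                      views: Dict[FrozenSet[str], Set[Dewey]],
                      index: Dict[str, List[Dewey]]) -> Set[Dewey]:
    """Answer `query` from cached view results where possible.

    A view is usable if its keyword set is a subset of the query. A node
    contains all keywords of a view exactly when it is an ancestor-or-self
    of one of that view's SLCA results, so the cached result can stand in
    for the view's keywords as a single match list.
    """
    remaining = set(query)
    inputs: List[Iterable[Dewey]] = []
    for view_keywords, view_result in views.items():
        if view_keywords <= query and view_keywords & remaining:
            inputs.append(view_result)
            remaining -= view_keywords
    # Keywords not covered by any view fall back to the raw keyword index.
    for kw in remaining:
        inputs.append(index.get(kw, []))
    return slca(inputs)


# Toy data (hypothetical): three <paper> elements under one root.
INDEX: Dict[str, List[Dewey]] = {
    "xml": [(1, 1, 1), (1, 2, 1)],
    "keyword": [(1, 2, 1), (1, 3, 1)],
    "liu": [(1, 1, 2), (1, 3, 2)],
    "chen": [(1, 2, 2)],
}

# One materialized view: the cached SLCA result of the query {xml, keyword}.
VIEWS: Dict[FrozenSet[str], Set[Dewey]] = {
    frozenset({"xml", "keyword"}): slca([INDEX["xml"], INDEX["keyword"]]),
}

if __name__ == "__main__":
    q = frozenset({"xml", "keyword", "chen"})
    print(answer_with_views(q, VIEWS, INDEX))
```

In this toy setup the three-keyword query is answered from the cached two-keyword view plus a single index lookup for the remaining keyword, and the result agrees with evaluating all three keyword lists directly.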


      • Published in

        ACM Transactions on Internet Technology  Volume 12, Issue 2
        December 2012
        94 pages
        ISSN: 1533-5399
        EISSN: 1557-6051
        DOI: 10.1145/2390209

        Copyright © 2012 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 December 2012
        • Accepted: 1 October 2012
        • Revised: 1 September 2011
        • Received: 1 June 2009


        Qualifiers

        • research-article
        • Research
        • Refereed