ABSTRACT
Static analysis is a fundamental task in query optimization. In this paper we study static analysis and optimization techniques for SPARQL, which is the standard language for querying Semantic Web data. Of particular interest for us is the optionality feature in SPARQL. It is crucial in Semantic Web data management, where data sources are inherently incomplete and the user is usually interested in partial answers to queries. This feature is one of the most complicated constructors in SPARQL and also the one that makes this language depart from classical query languages such as relational conjunctive queries. We focus on the class of well-designed SPARQL queries, which has been proposed in the literature as a fragment of the language with good properties regarding query evaluation. We first propose a tree representation for SPARQL queries, called pattern trees, which captures the class of well-designed SPARQL graph patterns and which can be considered as a query execution plan. Among other results, we propose several transformation rules for pattern trees, a simple normal form, and study equivalence and containment. We also study the enumeration and counting problems for this class of queries.
- D. J. Abadi, A. Marcus, S. Madden, and K. J. Hollenbach. Scalable semantic web data management using vertical partitioning. In VLDB, pages 411--422. ACM, 2007. Google Scholar
Digital Library
- R. Angles and C. Gutierrez. The expressive power of SPARQL. In ISWC, pages 114--129. Springer, 2008. Google Scholar
Digital Library
- M. Arenas and J. Pérez. Querying Semantic Web data with SPARQL. In PODS, pages 305--316. ACM, 2011. Google Scholar
Digital Library
- G. Bagan, A. Durand, and E. Grandjean. On acyclic conjunctive queries and constant delay enumeration. In CSL, pages 208--222. Springer, 2007. Google Scholar
Digital Library
- M. Bauland, P. Chapdelaine, N. Creignou, M. Hermann, and H. Vollmer. An algebraic approach to the complexity of generalized conjunctive queries. In SAT 2004 - Revised Selected Papers, pages 30--45. Springer, 2005. Google Scholar
Digital Library
- T. Berners-Lee. Linked data -- design issues. http://www.w3.org/DesignIssues/LinkedData.html, 2006.Google Scholar
- C. Bizer, T. Heath, and T. Berners-Lee. Linked data - the story so far. Int. J. Semantic Web Inf. Syst., 5(3):1--22, 2009.Google Scholar
Cross Ref
- A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In STOC, pages 77--90. ACM, 1977. Google Scholar
Digital Library
- M. Chekol, J. Euzenat, P. Genevès, and N. Layaïda. PSPARQL query containment. In DBPL, 2011.Google Scholar
- C. Chekuri and A. Rajaraman. Conjunctive query containment revisited. Theor. Comput. Sci., 239(2):211--229, 2000. Google Scholar
Digital Library
- S. Cohen and I. Fadida and Y. Kanza and B. Kimelfeld and Y. Sagiv. Full Disjunctions: Polynomial-Delay Iterators in Action. In VLDB, pages 739--750. ACM, 2006. Google Scholar
Digital Library
- A. Durand, M. Hermann, and P. G. Kolaitis. Subtractive reductions and complete problems for counting complexity classes. Theor. Comput. Sci., 340(3):496--513, 2005. Google Scholar
Digital Library
- J. Flum, M. Frick, and M. Grohe. Query evaluation via tree-decompositions. J. ACM, 49(6):716--752, 2002. Google Scholar
Digital Library
- J. Flum and M. Grohe. The parameterized complexity of counting problems. SIAM J. Comput., 33(4):892--922, 2004. Google Scholar
Digital Library
- M. A. Gallego, J. D. Fernández, M. A. Martínez-Prieto, and P. de la Fuente. An empirical study of real-world SPARQL queries. In USEWOD, 2011.Google Scholar
- G. Gottlob, N. Leone, and F. Scarcello. A comparison of structural CSP decomposition methods. Artif. Intell., 124(2):243--282, 2000. Google Scholar
Digital Library
- G. Gottlob, N. Leone, and F. Scarcello. Hypertree decompositions and tractable queries. J. Comput. Syst. Sci., 64(3):579--627, 2002.Google Scholar
Digital Library
- G. Greco and F. Scarcello. The power of tree projections: local consistency, greedy algorithms, and larger islands of tractability. In PODS 2010, pages 327--338. ACM, 2010. Google Scholar
Digital Library
- C. Gutierrez, C. A. Hurtado, A. O. Mendelzon, and J. Pérez. Foundations of semantic web databases. J. Comput. Syst. Sci., 77(3):520--541, 2011. Google Scholar
Digital Library
- L. A. Hemaspaandra and H. Vollmer. The satanic notations: Counting classes beyond \#P and other definitional adventures. SIGACT News, Complexity Theory Column 8, 26(1):2--13, 1995. Google Scholar
Digital Library
- Y. Kanza and W. Nutt and Y. Sagiv. Querying Incomplete Information in Semistructured Data. J. Comput. Syst. Sci., 64(3):655--693, 2002.Google Scholar
Digital Library
- P.-Å. Larson and J. Zhou. View matching for outer-join views. In VLDB, pages 445--456. ACM, 2005. Google Scholar
Digital Library
- O. Lassila and R. Swick. Resource description framework (RDF) model and syntax. W3C Recommnedation, January 1999.Google Scholar
- T. Neumann and G. Weikum. The rdf-3x engine for scalable management of rdf data. VLDB J., 19(1):91--113, 2010. Google Scholar
Digital Library
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics and complexity of SPARQL. In ISWC, pages 30--43. Springer, 2006. Google Scholar
Digital Library
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics of SPARQL. Technical Report, Universidad de Chile TR/DCC-2006--17, October 2006.Google Scholar
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics and complexity of SPARQL. ACM Trans. Database Syst., 34(3), 2009. Google Scholar
Digital Library
- F. Picalausa and S. Vansummeren. What are real SPARQL queries like? In SWIM, pages 7:1--7:6. ACM, 2011. Google Scholar
Digital Library
- R. Pichler and S. Skritek. Tractable counting of the answers to conjunctive queries. In AMW. CEUR-WS.org, 2011.Google Scholar
- A. Polleres. From SPARQL to rules (and back). In WWW, pages 787--796. ACM, 2007. Google Scholar
Digital Library
- E. Prud'Hommeaux and A. Seaborne. SPARQL query language for RDF. W3C Recommnedation, January 2008.Google Scholar
- M. Schmidt, T. Hornung, N. Küchlin, G. Lausen, and C. Pinkel. An experimental comparison of RDF data management approaches in a SPARQL benchmark scenario. In ISWC, pages 82--97. Springer, 2008. Google Scholar
Digital Library
- M. Schmidt, T. Hornung, G. Lausen, and C. Pinkel. SP$^2$Bench: A SPARQL performance benchmark. In ICDE, pages 222--233. IEEE, 2009. Google Scholar
Digital Library
- M. Schmidt, M. Meier, and G. Lausen. Foundations of SPARQL query optimization. In ICDT, pages 4--33. ACM, 2010. Google Scholar
Digital Library
- G. Serfiotis, I. Koffina, V. Christophides, and V. Tannen. Containment and minimization of RDF/S query patterns. In ISWC, pages 607--623. Springer, 2005. Google Scholar
Digital Library
- L. Sidirourgos, R. Goncalves, M. L. Kersten, N. Nes, and S. Manegold. Column-store support for RDF data management: not all swans are white. PVLDB, 1(2):1553--1563, 2008. Google Scholar
Digital Library
- M. Stocker, A. Seaborne, A. Bernstein, C. Kiefer, and D. Reynolds. Sparql basic graph pattern optimization using selectivity estimation. In WWW, pages 595--604. ACM, 2008. Google Scholar
Digital Library
- J. D. Ullman. Information integration using logical views. In ICDT, pages 19--40. Springer, 1997. Google Scholar
Digital Library
- L. G. Valiant. The complexity of enumeration and reliability problems. SIAM J. Comput., 8(3):410--421, 1979.Google Scholar
Digital Library
- C. Weiss, P. Karras, and A. Bernstein. Hexastore: sextuple indexing for semantic web data management. PVLDB, 1(1):1008--1019, 2008. Google Scholar
Digital Library
- M. Yannakakis. Algorithms for acyclic database schemes. In VLDB, pages 82--94. IEEE, 1981. Google Scholar
Digital Library
- http://data.gov.uk.Google Scholar
- http://www.data.gov.Google Scholar
- http://DBpedia.org/sparql.Google Scholar
- ARQ. http://sourceforge.net/projects/jena/files/ARQ/.Google Scholar
Index Terms
Static analysis and optimization of semantic web queries
Recommendations
Static analysis and optimization of semantic web queries
Invited papers issueStatic analysis is a fundamental task in query optimization. In this article we study static analysis and optimization techniques for SPARQL, which is the standard language for querying Semantic Web data. Of particular interest for us is the optionality ...
RDF, Jena, SparQL and the 'Semantic Web'
SIGUCCS '09: Proceedings of the 37th annual ACM SIGUCCS fall conference: communication and collaborationThe Resource Description Format (RDF) is used to represent information modeled as a "graph": a set of individual objects, along with a set of connections among those objects. In that role, RDF is one of the pillars of the so-called Semantic Web. This ...
Expressive Languages for Querying the Semantic Web
Best of PODS 2017, Best of ICDT 2017 and Regular PapersThe problem of querying RDF data is a central issue for the development of the Semantic Web. The query language SPARQL has become the standard language for querying RDF since its W3C standardization in 2008. However, the 2008 version of this language ...






Comments