ABSTRACT
This paper provides new worst-case bounds for the size and treewith of the result Q(D) of a conjunctive query Q to a database D. We derive bounds for the result size |Q(D)| in terms of structural properties of Q, both in the absence and in the presence of keys and functional dependencies. These bounds are based on a novel "coloring" of the query variables that associates a coloring number C(Q) to each query Q. Using this coloring number, we derive tight bounds for the size of Q(D) in case (i) no functional dependencies or keys are specified, and (ii) simple (one-attribute) keys are given. These results generalize recent size-bounds for join queries obtained by Atserias, Grohe, and Marx (FOCS 2008). An extension of our coloring technique also gives a lower bound for |Q(D)| in the general setting of a query with arbitrary functional dependencies. Our new coloring scheme also allows us to precisely characterize (both in the absence of keys and with simple keys) the treewidth-preserving queries--the queries for which the output treewidth is bounded by a function of the input treewidth. Finally we characterize the queries that preserve the sparsity of the input in the general setting with arbitrary functional dependencies.
- S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. Google Scholar
Digital Library
- A.V. Aho, C. Beeri, and J.D. Ullman. The theory of joins in relational databases. ACM Trans. Database Syst., 4(3):297--314, 1979. Google Scholar
Digital Library
- A.V. Aho, Y. Sagiv, and J.D. Ullman. Equivalence of relational expressions. SIAM J. of Computing, 8(2):218--246, May 1979.Google Scholar
Cross Ref
- S. Arnborg, J. Lagergren, and D. Seese. Easy problems for tree-decomposable graphs. J. Algorithms, 12(2):308--340, 1991. Google Scholar
Digital Library
- A. Atserias, M. Grohe, and D. Marx. Size bounds and query plans for relational joins. In IEEE FOCS'08. Google Scholar
Digital Library
- C. Beeri and M.Y. Vardi. A proof procedure for data dependencies. J. ACM, 31(4):718--741, 1984. Google Scholar
Digital Library
- A.K. Chandra and P.M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In ACM STOC, 1977. Google Scholar
Digital Library
- S. Chaudhuri. An overview of query optimization in relational systems. In PODS 1998. Google Scholar
Digital Library
- B. Courcelle. The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Inf. Comput., 85(1):12--75, 1990. Google Scholar
Digital Library
- A. Deutsch, L. Popa, and V. Tannen. Query reformulation with constraints. SIGMOD Rec., 35(1):65--73, 2006. Google Scholar
Digital Library
- R. Fagin, P.G. Kolaitis, R.J. Miller, and L. Popa. Data exchange: Semantics and query answering. In ICDT, 2003. Google Scholar
Digital Library
- G. Gottlob and S.T. Lee. A logical approach to multicut problems. Inf. Proc. Letters, 103(4):136--141, 2007. Google Scholar
Digital Library
- G. Gottlob, S.T. Lee, and G.J. Valiant. Size and treewidth bounds for conjunctive queries (extended version). Available from the authors upon request.Google Scholar
- G. Gottlob, R. Pichler, and F. Wei. Efficient datalog abduction through bounded treewidth. In AAAI, pages 1626--1631, 2007. Google Scholar
Digital Library
- M. Grohe and D. Marx. Constraint solving via fractional edge covers. In SODA 2006. Google Scholar
Digital Library
- P.J. Haas, J.F. Naughton, S. Seshadri, and A.N. Swami. Selectivity and cost estimation for joins based on random sampling. J. Comput. Syst. Sci., 52(3):550--569, 1996. Google Scholar
Digital Library
- M. Jarke and J. Koch. Query optimization in database systems. ACM Comput. Surv., 16(2):111--152, 1984. Google Scholar
Digital Library
- P. Kolaitis. Schema mappings, data exchange, and metadata management. In PODS, 2005. Google Scholar
Digital Library
- M. Lenzerini. Data integration: a theoretical perspective. In PODS, 2002. Google Scholar
Digital Library
- A.Y. Levy, A.O. Mendelzon, and Y. Sagiv. Answering queries using views. In PODS 1995. Google Scholar
Digital Library
- D. Maier, A.O. Mendelzon, and Y. Sagiv. Testing implications of data dependencies. ACM Trans. Database Syst., 4(4):455--469, 1979. Google Scholar
Digital Library
- F. Olken and D. Rotem. Random sampling from database files: A survey. In Proc. of Stat. and Scientific Database Management, 1990. Google Scholar
Digital Library
- N. Robertson and P.D. Seymour. Graph minors II: Algorithmic Aspects of Tree-Width. Journal of Algorithms, 7:309--322, 1986.Google Scholar
Cross Ref
- A.N. Swami and K.B. Schiefer. On the estimation of join result sizes. In Advances in Database Technology -- EDBT'94. 4th Int. Conf. on Extending Database Technology, 1994. Google Scholar
Digital Library
- M. Thorup. Structured programs have small tree-width and good register allocation. Information and Computation, 142:318--332, 1998. Google Scholar
Digital Library
Index Terms
Size and treewidth bounds for conjunctive queries
Recommendations
Size and Treewidth Bounds for Conjunctive Queries
This article provides new worst-case bounds for the size and treewith of the result Q(D) of a conjunctive query Q applied to a database D. We derive bounds for the result size |Q(D)| in terms of structural properties of Q, both in the absence and in the ...
Efficient approximations of conjunctive queries
PODS '12: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database SystemsWhen finding exact answers to a query over a large database is infeasible, it is natural to approximate the query by a more efficient one that comes from a class with good bounds on the complexity of query evaluation. In this paper we study such ...
Size Bounds for Factorised Representations of Query Results
We study two succinct representation systems for relational data based on relational algebra expressions with unions, Cartesian products, and singleton relations: f-representations, which employ algebraic factorisation using distributivity of product ...






Comments