ABSTRACT
We revisit the standard chase procedure, studying its properties and applicability to classical database problems. We settle (in the negative) the open problem of decidability of termination of the standard chase, and we provide sufficient termination conditions that are strictly less conservative than the best previously known. We investigate the adequacy of the standard chase for checking query containment under constraints, testing constraint implication, and computing certain answers in data exchange, gaining a deeper understanding by separating the algorithm from its result. We identify the properties of the chase result that are essential to the above applications, and we introduce the more general notion of F-universal model set, which supports query and constraint languages that are closed under a class F of mappings. By choosing F appropriately, we extend prior results to existential first-order queries and ∀∃-first-order constraints. We show that the standard chase is incomplete for finding universal model sets, and we introduce the extended core chase, which is complete, i.e., it finds an F-universal model set whenever one exists. A key advantage of the new chase is that the same algorithm can be applied for all mapping classes F of interest, simply by modifying the set of constraints given as input. Even when restricted to the typical input in prior work, the new chase supports certain answer computation and containment/implication tests in strictly more cases than the incomplete standard chase.
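For readers unfamiliar with the procedure, the following is a minimal sketch of a standard (restricted) chase over tuple-generating dependencies only. It is an illustration under stated assumptions, not the paper's algorithm and not the extended core chase: the encoding of atoms as tuples, the `?` prefix for variables, the `_N` prefix for labeled nulls, and the names `homomorphisms`, `find_trigger`, `chase`, and `max_steps` are all hypothetical choices made here. Because termination of the standard chase is undecidable, the sketch simply gives up after a fixed number of steps.

```python
from itertools import count

def is_var(term):
    # Assumed convention: variables in constraints start with "?".
    return term.startswith("?")

def homomorphisms(atoms, facts, h):
    """Yield every extension of mapping h that sends all atoms into facts."""
    if not atoms:
        yield h
        return
    first, rest = atoms[0], atoms[1:]
    rel, args = first[0], first[1:]
    for fact in facts:
        if fact[0] != rel or len(fact) != len(first):
            continue
        g = dict(h)
        ok = True
        for a, v in zip(args, fact[1:]):
            if is_var(a):
                if g.setdefault(a, v) != v:   # variable already bound elsewhere
                    ok = False
                    break
            elif a != v:                      # constant mismatch
                ok = False
                break
        if ok:
            yield from homomorphisms(rest, facts, g)

def find_trigger(facts, tgds):
    """Return (head, h) for some body match h that no head match extends."""
    for body, head in tgds:
        for h in homomorphisms(body, facts, {}):
            if next(homomorphisms(head, facts, h), None) is None:
                return head, h
    return None

def chase(facts, tgds, max_steps=100):
    """Run at most max_steps chase steps; return (facts, reached_fixpoint)."""
    facts = set(facts)
    fresh = count()
    for _ in range(max_steps):
        trigger = find_trigger(facts, tgds)
        if trigger is None:
            return facts, True
        head, h = trigger
        g = dict(h)
        for atom in head:
            for a in atom[1:]:
                if is_var(a) and a not in g:
                    g[a] = f"_N{next(fresh)}"  # fresh labeled null
        for rel, *args in head:
            facts.add((rel, *(g.get(a, a) for a in args)))
    return facts, find_trigger(facts, tgds) is None

# Emp(x) -> exists y. Mgr(x, y)
has_manager = ([("Emp", "?x")], [("Mgr", "?x", "?y")])
# Mgr(x, y) -> Emp(y)
manager_is_emp = ([("Mgr", "?x", "?y")], [("Emp", "?y")])

terminating = chase({("Emp", "alice")}, [has_manager])
bounded = chase({("Emp", "alice")}, [has_manager, manager_is_emp], max_steps=10)
```

With only `has_manager`, the sketch stops after one step, having invented a single null manager for `alice`. Adding `manager_is_emp` makes every invented manager an employee who in turn needs a manager, so the run exhausts `max_steps` without reaching a fixpoint, which is the kind of behavior that sufficient termination conditions are meant to rule out in advance.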