ABSTRACT
While incomplete information is ubiquitous in all data models, especially in applications involving data translation or integration, our understanding of it is still not completely satisfactory. For example, even such a basic notion as certain answers for XML queries was introduced only recently, and in a way seemingly rather different from relational certain answers.
The goal of this paper is to introduce a general approach to handling incompleteness, and to test its applicability in known data models such as relations and documents. The approach is based on representing degrees of incompleteness via semantics-based orderings on database objects. We use it both to obtain new results on incompleteness and to explain some previously observed phenomena. Specifically, we show that certain answers for relational and XML queries are two instances of the same general concept; we describe structural properties behind the naive evaluation of queries; we answer open questions on the existence of certain answers in the XML setting; and we show that previously studied ordering-based approaches were only adequate for SQL's primitive view of nulls. We define a general setting that subsumes relations and documents to help us explain in a uniform way how to compute certain answers, and when good solutions can be found in data exchange. We also look at the complexity of common problems related to incompleteness, and generalize several results from relational and XML contexts.
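To make the notions of naive evaluation and certain answers concrete, the following is a minimal sketch for the relational case only. It is not taken from the paper: the `Null` class, the example query Q(x, z) :- R(x, y), S(y, z), the relations `R` and `S`, and the finite `domain` used for the brute-force check are all illustrative assumptions. It relies on the classical fact that, for conjunctive queries, evaluating the query as if nulls were ordinary values and then discarding tuples that contain nulls yields the certain answers.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Null:
    """A marked null (hypothetical helper): equal only to a null with the same label."""
    label: str


def naive_join(r, s):
    """Evaluate Q(x, z) :- R(x, y), S(y, z), treating nulls as ordinary values."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def drop_nulls(answers):
    """Keep only the tuples made entirely of constants."""
    return {t for t in answers if not any(isinstance(v, Null) for v in t)}


def certain_by_naive_evaluation(r, s):
    """Naive evaluation followed by removal of tuples with nulls."""
    return drop_nulls(naive_join(r, s))


def apply_valuation(v, rel):
    """Replace each null in rel by the constant that valuation v assigns to it."""
    return {tuple(v.get(x, x) for x in t) for t in rel}


def certain_by_enumeration(r, s, domain):
    """Intersect the query answers over all valuations of nulls into `domain`.

    Assumption: `domain` is a small finite set standing in for the full
    (infinite) set of constants; this suffices for the example below.
    """
    nulls = sorted({x for t in r | s for x in t if isinstance(x, Null)},
                   key=lambda n: n.label)
    worlds = (dict(zip(nulls, vals))
              for vals in product(domain, repeat=len(nulls)))
    return set.intersection(
        *(naive_join(apply_valuation(v, r), apply_valuation(v, s))
          for v in worlds))


n1, n2 = Null("1"), Null("2")
R = {(1, n1), (2, 3)}
S = {(n1, 4), (3, n2), (n2, 5)}

print(certain_by_naive_evaluation(R, S))                 # → {(1, 4)}
print(certain_by_enumeration(R, S, [1, 2, 3, 4, 5, 6]))  # → {(1, 4)}
```

Both routes agree on this example: the tuple (1, 4) is an answer no matter how the nulls are instantiated, whereas the naive answer (2, ⊥2) depends on the value chosen for the null and is therefore not certain.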
Index Terms
Incomplete information and certain answers in general data models