ABSTRACT
A schema mapping is a high-level specification that describes the relationship between two database schemas. As schema mappings constitute the essential building blocks of data exchange and data integration, an extensive investigation of the foundations of schema mappings has been carried out in recent years. Even though several different aspects of schema mappings have been explored in considerable depth, the study of schema-mapping optimization remains largely uncharted territory to date.
In this paper, we lay the foundation for the development of a theory of schema-mapping optimization. Since schema mappings are constructs that live at the logical level of information integration systems, the first step is to introduce concepts and to develop techniques for transforming schema mappings to "equivalent" ones that are more manageable from the standpoint of data exchange or of some other data interoperability task. In turn, this has to start by introducing and studying suitable notions of "equivalence" between schema mappings. To this effect, we introduce the concept of data-exchange equivalence and the concept of conjunctive-query equivalence. These two concepts of equivalence are natural relaxations of the classical notion of logical equivalence; the first captures indistinguishability for data-exchange purposes, while the second captures indistinguishability for conjunctive-query-answering purposes. Moreover, they coincide with logical equivalence on schema mappings specified by source-to-target tuple-generating dependencies (s-t tgds), but differ on richer classes of dependencies, such as second-order tuple-generating dependencies (SO tgds) and sets of s-t tgds and target tuple-generating dependencies (target tgds).
After exploring the basic properties of these three notions of equivalence between schema mappings, we focus on the following question: under what conditions is a schema mapping conjunctive-query equivalent to a schema mapping specified by a finite set of s-t tgds? We answer this question by obtaining complete characterizations for two classes of schema mappings: those specified by an SO tgd, and those specified by a finite set of s-t tgds and target tgds that have a terminating chase. These characterizations involve boundedness properties of the cores of universal solutions.