DOI: 10.1145/1376916.1376922
research-article

Towards a theory of schema-mapping optimization

Published: 09 June 2008

ABSTRACT

A schema mapping is a high-level specification that describes the relationship between two database schemas. As schema mappings constitute the essential building blocks of data exchange and data integration, an extensive investigation of the foundations of schema mappings has been carried out in recent years. Even though several different aspects of schema mappings have been explored in considerable depth, the study of schema-mapping optimization remains largely uncharted territory to date.

In this paper, we lay the foundation for the development of a theory of schema-mapping optimization. Since schema mappings are constructs that live at the logical level of information integration systems, the first step is to introduce concepts and to develop techniques for transforming schema mappings to "equivalent" ones that are more manageable from the standpoint of data exchange or of some other data interoperability task. In turn, this has to start by introducing and studying suitable notions of "equivalence" between schema mappings. To this effect, we introduce the concept of data-exchange equivalence and the concept of conjunctive-query equivalence. These two concepts of equivalence are natural relaxations of the classical notion of logical equivalence; the first captures indistinguishability for data-exchange purposes, while the second captures indistinguishability for conjunctive-query-answering purposes. Moreover, they coincide with logical equivalence on schema mappings specified by source-to-target tuple-generating dependencies (s-t tgds), but differ on richer classes of dependencies, such as second-order tuple-generating dependencies (SO tgds) and sets of s-t tgds and target tuple-generating dependencies (target tgds).
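As a small illustration of what an s-t tgd specifies, the sketch below chases a source instance with one hypothetical dependency, Emp(x) -> exists y. Mgr(x, y). The relation names, the `_N` prefix for labeled nulls, and the function `chase_st_tgd` are assumptions made for this example and do not come from the paper.

```python
from itertools import count

# Hypothetical s-t tgd used for illustration only:
#   Emp(x) -> exists y. Mgr(x, y)
def chase_st_tgd(source_emp):
    """Chase the Emp relation with the toy tgd above.

    Each source fact Emp(x) yields one target fact Mgr(x, N), where N is a
    fresh labeled null standing for the existentially quantified y.
    """
    fresh = count(1)
    target_mgr = set()
    # Sort so that null names are assigned deterministically.
    for (x,) in sorted(source_emp):
        null = f"_N{next(fresh)}"
        target_mgr.add((x, null))
    return target_mgr

solution = chase_st_tgd({("alice",), ("bob",)})
```

The resulting target instance contains one `Mgr` fact per employee, each with its own null. Two mappings that always produce homomorphically equivalent results of this kind cannot be told apart by conjunctive queries over the target, which is the intuition behind the relaxed notions of equivalence described above.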

After exploring the basic properties of these three notions of equivalence between schema mappings, we focus on the following question: under what conditions is a schema mapping conjunctive-query equivalent to a schema mapping specified by a finite set of s-t tgds? We answer this question by obtaining complete characterizations for schema mappings that are specified by an SO tgd and for schema mappings that are specified by a finite set of s-t tgds and target tgds, and have terminating chase. These characterizations involve boundedness properties of the cores of universal solutions.
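To make the notion of a core concrete, here is a minimal brute-force sketch that shrinks a single-relation instance with labeled nulls to its core. It illustrates the definition only, not the characterizations obtained in the paper; the `_N` null convention and the function names are assumptions, and the search is exponential in the number of nulls.

```python
from itertools import product

def is_null(value):
    # Assumed convention: labeled nulls are strings starting with "_N".
    return isinstance(value, str) and value.startswith("_N")

def core(instance):
    """Return a core of a set of tuples over one relation.

    Repeatedly look for a mapping of nulls to values (constants stay fixed)
    whose image is a proper subset of the current facts, and shrink to that
    image until no such mapping exists.
    """
    facts = set(instance)
    changed = True
    while changed:
        changed = False
        values = sorted({v for fact in facts for v in fact})
        nulls = [v for v in values if is_null(v)]
        for images in product(values, repeat=len(nulls)):
            h = dict(zip(nulls, images))
            image = {tuple(h.get(v, v) for v in fact) for fact in facts}
            if image < facts:  # endomorphism onto a proper subinstance
                facts = image
                changed = True
                break
    return facts

redundant = {("alice", "_N1"), ("alice", "bob")}
minimal = core(redundant)
```

In this example the fact with the null is subsumed by the fact with the constant `bob`, so the core keeps only the latter. An instance whose only fact carries a null, such as `{("alice", "_N1")}`, is already its own core.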
