Abstract
In modern data integration scenarios, many remote data sources are located on the Web and are accessible only through forms or Web services, with no guarantee of their stability. In these contexts, detecting mappings that have been corrupted by a change in the source or target schema is a key problem. A corrupted mapping fails to match the target or the source schema; hence it cannot transform data conforming to a schema S into data conforming to a schema T, nor can it be used for effective query reformulation.
This article describes a novel technique for maintaining schema mappings in XML data integration systems, based on a notion of mapping correctness relying on the denotational semantics of mappings.
Supplemental Material
Online appendix to detection of corrupted schema mappings in XML data integration systems. The appendix supports the information on article 14.