research-article

Detection of corrupted schema mappings in XML data integration systems

Published: 14 October 2009

Abstract

In modern data integration scenarios, many remote data sources are located on the Web and are accessible only through forms or Web services, with no guarantee of their stability. In these contexts, detecting corrupted mappings, which arise when the source or the target schema changes, is a key problem. A corrupted mapping fails to match the target or the source schema; hence it cannot transform data conforming to a schema S into data conforming to a schema T, nor can it be used for effective query reformulation.

This article describes a novel technique for maintaining schema mappings in XML data integration systems, based on a notion of mapping correctness relying on the denotational semantics of mappings.
