ABSTRACT
Abiteboul et al. initiated the systematic study of distributed XML documents consisting of several logical parts, possibly located on different machines. The physical distribution of such documents immediately raises the following question: how can a global schema for the distributed document be broken up into local schemas for the different logical parts? The desired set of local schemas should guarantee that, if each logical part satisfies its local schema, then the distributed document satisfies the global schema.
Abiteboul et al. proposed three levels of desirability for local schemas: local typing, maximal local typing, and perfect local typing. Immediate algorithmic questions are: (i) given a typing, determine whether it is local, maximal local, or perfect, and (ii) given a document and a schema, establish whether a (maximal) local or perfect typing exists. This paper tightens the complexity bounds left open in their work and initiates the study of (i) and (ii) for schema restrictions arising from the current standards: DTDs and XML Schemas with deterministic content models. The most striking result is that under these restrictions the perfect typing problem becomes tractable.
Furthermore, an open problem in Formal Language Theory is settled: deciding language primality for deterministic finite automata is PSPACE-complete.
REFERENCES
- S. Abiteboul, G. Gottlob, and M. Manna. Distributed XML design. In ACM PODS, pages 247--258, 2009.
- S.V. Avgustinovich and A. Frid. A unique decomposition theorem for factorial languages. Int. J. of Algebra and Comput., 15:149--160, 2005.
- S. Bala. Regular language matching and other decidable cases of the satisfiability problem for constraints between regular open terms. In STACS, pages 596--607, 2004.
- G. J. Bex, W. Gelade, W. Martens, and F. Neven. Simplifying XML Schema: effortless handling of nondeterministic regular expressions. In ACM SIGMOD, pages 731--744, 2009.
- T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, and F. Yergeau. Extensible Markup Language (XML) 1.0 (fifth edition). Technical report, World Wide Web Consortium (W3C), November 2008. W3C Recommendation, http://www.w3.org/TR/2008/REC-xml-20081126/.
- A. Brüggemann-Klein and D. Wood. One-unambiguous regular languages. Inf. and Comput., 142(2):182--206, 1998.
- D. Calvanese, G. De Giacomo, M. Lenzerini, and M.Y. Vardi. Rewriting of regular expressions and regular path queries. J. Comp. Syst. Sc., 64(3):443--465, 2002.
- J. Clark and M. Murata. Relax NG specification. http://www.relaxng.org/spec-20011203.html, December 2001.
- J.H. Conway. Regular Algebra and Finite Machines. Chapman and Hall, 1971.
- J. Czyzowicz, W. Fraczak, A. Pelc, and W. Rytter. Linear-time prime decompositions of regular prefix codes. Int. J. Found. Comp. Sc., 14:1019--1031, 2003.
- D. Fallside and P. Walmsley. XML Schema Part 0: Primer (second edition). Technical report, World Wide Web Consortium, October 2004. http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/.
- S. Gao, C. M. Sperberg-McQueen, H.S. Thompson, N. Mendelsohn, D. Beech, and M. Maloney. W3C XML Schema Definition Language (XSD) 1.1 part 1: Structures. Technical report, World Wide Web Consortium, April 2009. W3C Recommendation, http://www.w3.org/TR/2009/CR-xmlschema11-1-20090430/.
- Y.-S. Han, K. Salomaa, and D. Wood. Prime decompositions of regular languages. In DLT, pages 145--155, 2006.
- T. Jiang and B. Ravikumar. Minimal NFA problems are hard. SIAM J. Comp., 22(6):1117--1141, 1993.
- M. Kunc. What do we know about language equations? In DLT, pages 23--27, 2007.
- W. Martens, F. Neven, and T. Schwentick. Complexity of decision problems for XML schemas and chain regular expressions. SIAM J. Comp., 39(4):1486--1530, 2009.
- Y. Papakonstantinou and V. Vianu. DTD inference for views of XML data. In ACM PODS, pages 35--46, 2000.
- A. Salomaa, K. Salomaa, and S. Yu. Length codes, products of languages and primality. In LATA, pages 476--486, 2008.
- A. Salomaa and S. Yu. On the decomposition of finite languages. In DLT, pages 22--31, 1999.
- K. Salomaa. Language decompositions, primality, and trajectory-based operations. In CIAA, pages 17--22, 2008.
- P. van Emde Boas. The convenience of tilings. In A. Sorbi, editor, Complexity, Logic and Recursion Theory, volume 187 of Lecture Notes in Pure and Applied Mathematics, pages 331--363. Marcel Dekker Inc., 1997.
- M. Y. Vardi. An automata-theoretic approach to linear temporal logic. In BANFF, pages 238--266, 1995.
- W. Wieczorek. An algorithm for the decomposition of finite languages. Logic J. of the IGPL, 2009. Appeared on-line August 8, 2009.
- S. Yu. Regular languages. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 1, chapter 2. Springer, 1997.
Index Terms
Schema design for XML repositories: complexity and tractability