ABSTRACT
A distributed XML document is an XML document that spans several machines or Web repositories. We assume that a distribution design of the document tree is given, providing an XML tree some of whose leaves are "docking points" to which XML subtrees can be attached. These subtrees may be provided and controlled by peers at remote locations, or may correspond to the result of function calls, e.g., Web services. If a global type τ, e.g., a DTD, is specified for a distributed document T, it is highly desirable to break this type into a collection of local types, called a local typing, such that the document satisfies τ if and only if each peer (or function) satisfies its local type. In this paper we lay out the fundamentals of a theory of local typing and provide formal definitions of three main variants of locality: local typing, maximal local typing, and perfect typing, the last being the most desirable. We study the following decision problems: (i) given a typing for a design, determine whether it is local, maximal local, or perfect; (ii) given a design, determine whether a local, maximal local, or perfect typing exists. For some of these problems we provide tight complexity bounds (polynomial space), while for the others we show exponential upper bounds. A main contribution is a polynomial-space algorithm for computing a perfect typing in this context, if one exists.
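The "if and only if" condition in the abstract can be made concrete with a small sketch. It is only an illustration under strong simplifying assumptions and is not the paper's algorithm: documents are flattened to the string of child labels under a single root, types are ordinary regular expressions over those labels, and the check is a brute-force test over all attachments up to a bounded length rather than a decision procedure. The names `words` and `is_local_up_to`, the kernel template syntax, and the example types are all hypothetical.

```python
import re
from itertools import product


def words(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def is_local_up_to(global_re, kernel, local_res, alphabet, max_len):
    """Bounded check of the 'iff' condition (hypothetical helper).

    `kernel` is a template such as "a{}b{}", where each "{}" stands for a
    docking point. For every combination of attached strings up to
    `max_len`, the assembled document must match the global type exactly
    when each attached string matches its local type.
    """
    global_type = re.compile(global_re)
    local_types = [re.compile(r) for r in local_res]
    candidates = list(words(alphabet, max_len))
    for parts in product(candidates, repeat=len(local_types)):
        document = kernel.format(*parts)
        globally_ok = global_type.fullmatch(document) is not None
        locally_ok = all(
            t.fullmatch(p) is not None for t, p in zip(local_types, parts)
        )
        if globally_ok != locally_ok:
            return False  # counterexample: the typing is not local
    return True  # no counterexample found within the bound


# Kernel "a f1 b f2" with global type a c* b d+: the typing (c*, d+) passes.
independent = is_local_up_to("ac*bd+", "a{}b{}", ["c*", "d+"], "cde", 3)

# Global type a(c b d | e b e): the two docking points are correlated, so
# the typing (c|e, d|e) admits "acbe", which the global type rejects.
correlated = is_local_up_to("a(cbd|ebe)", "a{}b{}", ["c|e", "d|e"], "cde", 3)
```

In the first case the global type constrains each docking point independently, so the local types capture it exactly within the bound. In the second case, whether an attachment is acceptable at one docking point depends on the other, which is the kind of situation where a typing fails to be local.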
Index Terms
Distributed XML design