ABSTRACT
Constraints are important not just for maintaining data integrity, but also because they capture natural probabilistic dependencies among data items. A probabilistic XML database (PXDB) is the probability subspace comprising the instances of a p-document that satisfy a set of constraints. Query evaluation is shown to be tractable in PXDBs, in contrast to existing models that can express probabilistic dependencies. The problems of sampling and determining well-definedness (i.e., whether the above subspace is nonempty) are also tractable. Furthermore, queries and constraints can include the aggregate functions count, max, min and ratio. Finally, this approach can be easily extended to allow a probabilistic interpretation of constraints.
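To make the definition of a PXDB concrete, the following minimal sketch conditions a toy p-document on a constraint by enumerating all possible worlds. It is a brute-force illustration that takes exponential time, not the tractable algorithm the abstract refers to. The child names, probabilities, constraint, and query are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy p-document: each optional child of the root appears
# independently with the given probability.
CHILDREN = [
    ("phone_a", Fraction(1, 2)),
    ("phone_b", Fraction(1, 2)),
    ("email", Fraction(3, 4)),
]


def worlds():
    """Yield (set of present children, probability) for every possible world."""
    for choices in product([True, False], repeat=len(CHILDREN)):
        prob = Fraction(1)
        present = set()
        for (name, p), keep in zip(CHILDREN, choices):
            prob *= p if keep else 1 - p
            if keep:
                present.add(name)
        yield present, prob


def constraint(world):
    """Assumed constraint using the count aggregate: at most one phone node."""
    return sum(1 for name in world if name.startswith("phone")) <= 1


def query(world):
    """Assumed Boolean query: some phone node is present."""
    return any(name.startswith("phone") for name in world)


def constraint_mass():
    """Total probability of the worlds that satisfy the constraint."""
    return sum(p for w, p in worlds() if constraint(w))


def well_defined():
    """The PXDB is well defined when the constrained subspace is nonempty."""
    return constraint_mass() > 0


def conditional_probability():
    """Pr(query | constraint), computed by naive enumeration."""
    mass = constraint_mass()
    if mass == 0:
        raise ValueError("no instance satisfies the constraint")
    hits = sum(p for w, p in worlds() if constraint(w) and query(w))
    return hits / mass


print(well_defined())             # True
print(conditional_probability())  # 2/3
```

In the unconstrained p-document the query holds with probability 3/4. Conditioning on the constraint removes the world with both phone nodes and lowers the probability to 2/3, which shows how a constraint induces a dependency between nodes that were chosen independently.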