ABSTRACT
We describe an algorithm that evaluates queries over probabilistic databases using Möbius' inversion formula in incidence algebras. The queries we consider are unions of conjunctive queries (equivalently, existential positive first-order sentences), and the probabilistic databases are tuple-independent structures. Our algorithm runs in PTIME on a subset of queries called "safe" queries, and it is complete in the sense that every unsafe query is hard for the class FP^{#P}. The algorithm is simple and easy to implement in practice, yet it is non-obvious. Möbius' inversion formula, which is in essence inclusion-exclusion, plays a key role for completeness: it allows the algorithm to compute the probability of some safe queries even when some of their subqueries are unsafe. We also apply the same lattice-theoretic techniques to analyze an algorithm based on lifted conditioning, and prove that it is incomplete.
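The role of inclusion-exclusion can be illustrated on a small grounded example. The sketch below is not the algorithm of the paper, which operates on the query itself rather than on its grounding; it only shows, for a disjunction of conjunctions over independent tuples, how grouping inclusion-exclusion terms by the tuple set they conjoin can make some terms cancel, so that those terms never need to be evaluated. All names (`mobius_coefficients`, `union_probability`, `brute_force`) and the example clauses are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations, product
from math import prod


def mobius_coefficients(clauses):
    """Group inclusion-exclusion terms by the set of tuples they conjoin.

    Each clause is a frozenset of tuple names (a conjunction). The returned
    dict maps a tuple set to its summed signed coefficient; terms whose
    coefficients cancel to zero are dropped.
    """
    coeff = defaultdict(int)
    for k in range(1, len(clauses) + 1):
        for subset in combinations(clauses, k):
            joined = frozenset().union(*subset)
            coeff[joined] += (-1) ** (k + 1)
    return {s: c for s, c in coeff.items() if c != 0}


def union_probability(clauses, p):
    """P(clause_1 OR ... OR clause_n) under tuple independence."""
    return sum(
        c * prod(p[t] for t in s)
        for s, c in mobius_coefficients(clauses).items()
    )


def brute_force(clauses, p):
    """Reference value: sum the weights of all satisfying possible worlds."""
    tuples = sorted(p)
    total = 0.0
    for world in product([False, True], repeat=len(tuples)):
        present = {t for t, bit in zip(tuples, world) if bit}
        if any(c <= present for c in clauses):
            total += prod(p[t] if t in present else 1 - p[t] for t in tuples)
    return total


# Hypothetical example: (a AND b) OR (b AND c) OR (c AND d).
clauses = [frozenset("ab"), frozenset("bc"), frozenset("cd")]
p = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5}

coeffs = mobius_coefficients(clauses)
# The term for {a, b, c, d} arises twice with opposite signs and cancels.
print(frozenset("abcd") in coeffs)
print(union_probability(clauses, p), brute_force(clauses, p))
```

In this example the conjunction of all four tuples receives coefficient zero, so its probability is never computed. This is a propositional analogue of the situation the abstract describes, where a subquery may be ignored even if it is hard on its own.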