ABSTRACT
Key violations often occur in real-life datasets, especially in those integrated from different sources. Enforcing key constraints strictly on such datasets is not feasible. In this paper we formalize the notion of soft-key constraints on probabilistic databases, which allow violations of a key constraint by penalizing every violating world by a quantity proportional to the violation. To represent a probabilistic database with such constraints, we define a class of Markov networks on which query evaluation can be done in PTIME. We also study the evaluation of conjunctive queries on relations with soft keys and present a dichotomy that separates these queries into those that can be evaluated in PTIME and the rest, which are #P-hard.
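The idea of penalizing violating worlds can be illustrated with a small brute-force sketch. It is not the paper's formalization or its PTIME algorithm: it assumes independent tuple probabilities, treats the first attribute as the key, and multiplies a world's weight by a hypothetical factor `penalty` once for every pair of tuples that share a key value (so the penalty is proportional to the violation in log space). The names `world_weight` and `query_probability` are illustrative only, and the enumeration is exponential in the number of tuples.

```python
from itertools import combinations, product


def world_weight(world, tuple_probs, penalty):
    """Unnormalized weight of one possible world (assumed model, see lead-in)."""
    weight = 1.0
    for t, p in tuple_probs.items():
        # Independent tuples: p if present in the world, 1 - p if absent.
        weight *= p if t in world else (1.0 - p)
    # Count pairs of tuples that agree on the key (first attribute).
    violations = sum(
        1 for a, b in combinations(sorted(world), 2) if a[0] == b[0]
    )
    # Assumed soft-key penalty: one factor per violating pair.
    return weight * (penalty ** violations)


def query_probability(tuple_probs, penalty, query):
    """Probability that a Boolean query holds, by enumerating all worlds."""
    tuples = list(tuple_probs)
    total = 0.0
    satisfied = 0.0
    for mask in product([False, True], repeat=len(tuples)):
        world = frozenset(t for t, keep in zip(tuples, mask) if keep)
        w = world_weight(world, tuple_probs, penalty)
        total += w
        if query(world):
            satisfied += w
    return satisfied / total


# Two conflicting claims about the same key value "alice".
probs = {("alice", "Seattle"): 0.5, ("alice", "Boston"): 0.5}


def lives_in_seattle(world):
    return any(city == "Seattle" for _, city in world)


p_no_constraint = query_probability(probs, 1.0, lives_in_seattle)
p_soft_key = query_probability(probs, 0.5, lives_in_seattle)
p_hard_key = query_probability(probs, 0.0, lives_in_seattle)
```

Under these assumptions, `penalty = 1.0` leaves the tuples independent, `penalty = 0.0` behaves like a strictly enforced key, and values in between give the soft behavior described above.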