DOI: 10.1145/1376916.1376932
research-article

Approximating predicates and expressive queries on probabilistic databases

Published: 09 June 2008

ABSTRACT

We study complexity and approximation of queries in an expressive query language for probabilistic databases. The language studied supports the compositional use of confidence computation. It allows for a wide range of new use cases, such as the computation of conditional probabilities and of selections based on predicates that involve marginal and conditional probabilities. These features have important applications in areas such as data cleaning and the processing of sensor data. We establish techniques for efficiently computing approximate query results and for estimating the error incurred by queries. The central difficulty is due to selection predicates based on approximated values, which may lead to the unreliable selection of tuples. A database may contain certain singularities at which approximation of predicates cannot be achieved; however, the paper presents an algorithm that provides efficient approximation otherwise.
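
The difficulty with predicates over approximated values can be illustrated with a small sketch. It is not the algorithm from the paper: it is a generic Monte Carlo estimator with a Hoeffding bound, and the function names (`hoeffding_samples`, `estimate_confidence`, `approx_predicate`) are hypothetical. It also assumes that a "singularity" can be read as a point where the true probability lies at or very near the comparison threshold, which is an interpretation of the abstract rather than a definition taken from it.

```python
import math
import random


def hoeffding_samples(eps, delta):
    """Number of i.i.d. 0/1 samples n such that the sample mean is within
    eps of the true probability with probability at least 1 - delta.
    Follows from Hoeffding's inequality: P(|p_hat - p| >= eps) <= 2*exp(-2*n*eps^2)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))


def estimate_confidence(sample_event, eps, delta, rng):
    """Estimate P(event) by sampling. sample_event(rng) is a hypothetical
    callback that returns True when the event holds in one sampled world."""
    n = hoeffding_samples(eps, delta)
    hits = sum(1 for _ in range(n) if sample_event(rng))
    return hits / n


def approx_predicate(sample_event, threshold, eps, delta, rng):
    """Decide the predicate `confidence >= threshold` from an estimate.
    Returns True or False when the eps-interval around the estimate lies
    entirely on one side of the threshold, and None when it straddles the
    threshold, so the tuple cannot be selected or rejected reliably."""
    p_hat = estimate_confidence(sample_event, eps, delta, rng)
    if p_hat - eps >= threshold:
        return True
    if p_hat + eps < threshold:
        return False
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # True probability 0.8, well away from the threshold 0.5: decidable.
    print(approx_predicate(lambda r: r.random() < 0.8, 0.5, 0.05, 1e-6, rng))
    # True probability equal to the threshold: the interval straddles it.
    print(approx_predicate(lambda r: r.random() < 0.5, 0.5, 0.05, 1e-6, rng))
```

In this sketch, shrinking `eps` resolves any predicate whose true probability differs from the threshold, at the cost of more samples. When the two coincide, no finite `eps` separates them, which is why the sketch reports `None` instead of guessing.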


Published in: PODS '08: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 2008, 330 pages
ISBN: 9781605581521
DOI: 10.1145/1376916

Copyright © 2008 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 476 of 1,835 submissions, 26%
