research-article

Property Testing of Joint Distributions using Conditional Samples

Published: 22 August 2018

Abstract

In this article, we consider the problem of testing properties of joint distributions under the conditional sampling framework. In the standard sampling model, the sample complexity of testing properties of joint distributions is exponential in the dimension, which makes the resulting algorithms impractical. While recent results achieve efficient algorithms with significantly smaller sample complexity for product distributions, no efficient algorithm is expected when the marginals are not independent.

In this article, we initiate the study of conditional sampling in the multidimensional setting. We propose a subcube conditional sampling model in which the tester can condition on an (adaptively) chosen subcube of the domain. Due to its simplicity, this model is potentially implementable in many practical applications, particularly when the distribution is a joint distribution over Σ^n for some set Σ.
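As an illustration of the subcube conditional sampling model described above, here is a minimal Python sketch of such an oracle. It assumes, purely for illustration, that the joint distribution is available as an explicit dictionary from n-tuples to probabilities; in the testing setting the tester only has black-box access to samples. The function name `subcube_conditional_sample`, the `restrictions` argument, and the choice to raise an error on a zero-mass subcube are assumptions of this sketch, not details taken from the article.

```python
import random
from itertools import product


def subcube_conditional_sample(dist, restrictions, rng=random):
    """Draw one sample from `dist` conditioned on a subcube.

    dist:         dict mapping n-tuples over the alphabet to probabilities
                  (hypothetical explicit representation).
    restrictions: list of n sets; restrictions[i] is the allowed subset A_i
                  of the alphabet for coordinate i, so the subcube is
                  A_1 x A_2 x ... x A_n.
    """
    # Keep only the points of positive mass that lie inside the subcube.
    support = [
        (x, p)
        for x, p in dist.items()
        if p > 0 and all(xi in allowed for xi, allowed in zip(x, restrictions))
    ]
    if not support:
        # Assumption of this sketch: a zero-mass subcube is reported as an error.
        raise ValueError("subcube has zero probability mass")
    points, weights = zip(*support)
    # random.choices normalizes the weights, which performs the conditioning.
    return rng.choices(points, weights=weights, k=1)[0]


# Usage: uniform distribution over {0,1}^3, conditioned on first coordinate = 1.
uniform = {x: 1 / 8 for x in product((0, 1), repeat=3)}
subcube = [{1}, {0, 1}, {0, 1}]
sample = subcube_conditional_sample(uniform, subcube)
```

An adaptive tester would choose each new `restrictions` list based on the samples it has already seen, for example by fixing a prefix of coordinates to previously observed values.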

We present algorithms for various fundamental properties of distributions in the subcube-conditioning model and prove that the sample complexity is polynomial in the dimension n (and not exponential as in the traditional model). We present an algorithm for testing identity to a known distribution using Õ(n^2) subcube-conditional samples, an algorithm for testing identity between two unknown distributions using Õ(n^5) subcube-conditional samples, and an algorithm for testing identity to a product distribution using Õ(n^5) subcube-conditional samples.

The central concept of our technique involves an elegant chain rule, which can be proved using basic techniques of probability theory, yet it is powerful enough to avoid the curse of dimensionality.
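The abstract does not state the chain rule itself. One standard inequality in this spirit, which may differ from the article's exact statement, bounds the total variation distance between two joint distributions P and Q by the sum, over coordinates i, of the expected distance between their conditional marginals on coordinate i given a prefix drawn from P: d_TV(P, Q) ≤ Σ_i E_{x_<i ~ P}[ d_TV(P_i(· | x_<i), Q_i(· | x_<i)) ]. Each term involves only distributions over Σ, and conditioning on a prefix is a special case of conditioning on a subcube. The Python sketch below checks this inequality numerically on small random distributions; all function names are hypothetical, and Q is assumed to have full support so that every conditional is defined.

```python
import random
from itertools import product


def tv(p, q):
    """Total variation distance between two dict-based distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def prefix_mass(dist, prefix):
    """Probability that a sample from `dist` starts with `prefix`."""
    k = len(prefix)
    return sum(p for x, p in dist.items() if x[:k] == prefix)


def conditional_marginal(dist, prefix, alphabet):
    """Distribution of the next coordinate given `prefix` (mass must be > 0)."""
    base = prefix_mass(dist, prefix)
    return {a: prefix_mass(dist, prefix + (a,)) / base for a in alphabet}


def chain_rule_bound(p, q, n, alphabet):
    """Sum over i of E_{prefix ~ p} of the TV distance of conditional marginals."""
    total = 0.0
    for i in range(n):
        for prefix in product(alphabet, repeat=i):
            weight = prefix_mass(p, prefix)
            if weight > 0:
                total += weight * tv(
                    conditional_marginal(p, prefix, alphabet),
                    conditional_marginal(q, prefix, alphabet),
                )
    return total


def random_dist(n, alphabet, rng):
    """Random full-support distribution over alphabet^n (illustration only)."""
    weights = {x: rng.random() + 0.01 for x in product(alphabet, repeat=n)}
    z = sum(weights.values())
    return {x: w / z for x, w in weights.items()}


rng = random.Random(0)
P = random_dist(3, (0, 1), rng)
Q = random_dist(3, (0, 1), rng)
assert tv(P, Q) <= chain_rule_bound(P, Q, 3, (0, 1)) + 1e-12
```

The point of the decomposition is that a distance over a domain of size |Σ|^n is controlled by n quantities, each a distance between distributions over Σ only.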

Published in

ACM Transactions on Computation Theory, Volume 10, Issue 4
December 2018
121 pages
ISSN: 1942-3454
EISSN: 1942-3462
DOI: 10.1145/3271481

Copyright © 2018 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 22 August 2018
• Accepted: 1 June 2018
• Revised: 1 May 2018
• Received: 1 March 2018

Qualifiers

• research-article
• Research
• Refereed
