research-article

Optimal Distribution-Free Sample-Based Testing of Subsequence-Freeness with One-Sided Error

Published: 23 March 2022

Abstract

In this work, we study the problem of testing subsequence-freeness. For a given subsequence (word) \( w = w_1 \dots w_k \), a sequence (text) \( T = t_1 \dots t_n \) is said to contain w if there exist indices \( 1 \le i_1 < \dots < i_k \le n \) such that \( t_{i_j} = w_j \) for every \( 1 \le j \le k \). Otherwise, T is w-free. While a large majority of the research in property testing deals with algorithms that perform queries, here we consider sample-based testing (with one-sided error). In the “standard” sample-based model (i.e., under the uniform distribution), the algorithm is given samples \( (i, t_i) \) where i is distributed uniformly independently at random. The algorithm should distinguish between the case that T is w-free, and the case that T is ε-far from being w-free (i.e., more than an ε-fraction of its symbols should be modified so as to make it w-free). Freitag, Price, and Swartworth (Proceedings of RANDOM, 2017) showed that \( O((k^2 \log k)/\varepsilon) \) samples suffice for this testing task. We obtain the following results.
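
To make the definitions above concrete, here is a minimal Python sketch of containment and of the distance from w-freeness. The function names and the explicit `alphabet` argument are illustrative assumptions, not taken from the article, and the distance is computed by exhaustive search, so it is only usable on very short texts.

```python
from itertools import product


def contains(text, word):
    """Return True if `word` occurs in `text` as a (not necessarily
    contiguous) subsequence, using a greedy left-to-right scan."""
    j = 0
    for symbol in text:
        if j < len(word) and symbol == word[j]:
            j += 1
    return j == len(word)


def distance_from_freeness(text, word, alphabet):
    """Brute force: the smallest fraction of positions of `text` that must
    be modified (to symbols of `alphabet`) so that the result is word-free.

    Assumes a non-empty text; enumerates all |alphabet|^n candidate texts.
    """
    n = len(text)
    best = None
    for candidate in product(alphabet, repeat=n):
        if not contains(candidate, word):
            changes = sum(a != b for a, b in zip(text, candidate))
            best = changes if best is None else min(best, changes)
    if best is None:
        raise ValueError("no word-free text exists over this alphabet")
    return best / n


if __name__ == "__main__":
    print(contains("abcab", "aab"))                    # True
    print(distance_from_freeness("abab", "ab", "ab"))  # 0.5
```

In this sketch a text T is ε-far from w-free exactly when `distance_from_freeness` returns a value greater than ε.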

The number of samples sufficient for one-sided error sample-based testing (under the uniform distribution) is \( O(k/\varepsilon) \). This upper bound builds on a characterization that we present for the distance of a text T from w-freeness in terms of the maximum number of copies of w in T, where these copies should obey certain restrictions.

We prove a matching lower bound, which holds for every word w. This implies that the above upper bound is tight.
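
As an illustration of what a one-sided error sample-based tester can look like under the uniform distribution, the sketch below draws on the order of k/ε uniform samples and rejects only if the sampled symbols, ordered by index, contain w. This is a natural tester written under stated assumptions, not necessarily the authors' exact algorithm; the constant `c` and all function names are hypothetical, since the abstract gives only the asymptotic bound.

```python
import math
import random


def contains(text, word):
    """Greedy check that `word` is a subsequence of `text`."""
    j = 0
    for symbol in text:
        if j < len(word) and symbol == word[j]:
            j += 1
    return j == len(word)


def sample_size(k, eps, c=4):
    """Number of samples, c * k / eps rounded up. The constant c is an
    assumption made for this sketch; the abstract only states O(k / eps)."""
    return math.ceil(c * k / eps)


def one_sided_tester(text, word, eps, c=4, rng=random):
    """Return True to accept, False to reject.

    Indices are drawn uniformly and independently; repeated indices are
    merged so that one position cannot play two roles in a copy of `word`.
    """
    n = len(text)
    m = sample_size(len(word), eps, c)
    indices = sorted({rng.randrange(n) for _ in range(m)})
    sampled = [text[i] for i in indices]
    return not contains(sampled, word)


if __name__ == "__main__":
    # A w-free text is accepted on every run.
    print(one_sided_tester("bbbaaa", "ab", eps=0.25))
```

The error is one-sided because the sampled symbols form a subsequence of T: if T is w-free, no sample can contain w, so the tester never rejects a w-free text.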

The same upper bound holds in the more general distribution-free sample-based model. In this model, the algorithm receives samples \( (i, t_i) \) where i is distributed according to an arbitrary distribution p (and the distance from w-freeness is measured with respect to p).
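
The distribution-free model can be sketched in the same way. In the Python below, samples are drawn according to a weight vector `p`, and the distance from w-freeness is the total p-weight of the modified positions, found by exhaustive search over tiny texts. The helper names, the `alphabet` argument, and the accept/reject rule are assumptions made for illustration rather than the article's construction.

```python
import random
from itertools import product


def contains(text, word):
    """Greedy check that `word` is a subsequence of `text`."""
    j = 0
    for symbol in text:
        if j < len(word) and symbol == word[j]:
            j += 1
    return j == len(word)


def draw_samples(text, p, m, rng=random):
    """Draw m samples (i, t_i), with index i distributed according to p."""
    indices = rng.choices(range(len(text)), weights=p, k=m)
    return [(i, text[i]) for i in indices]


def accepts(samples, word):
    """Accept unless the sampled symbols, ordered by index and with
    repeated indices merged, contain `word` as a subsequence."""
    by_index = dict(samples)
    ordered = [by_index[i] for i in sorted(by_index)]
    return not contains(ordered, word)


def weighted_distance(text, word, p, alphabet):
    """Brute force: minimum total p-weight of positions that must be
    modified to make `text` word-free. Returns None if no word-free
    text exists over `alphabet`."""
    best = None
    for candidate in product(alphabet, repeat=len(text)):
        if not contains(candidate, word):
            cost = sum(w for a, b, w in zip(text, candidate, p) if a != b)
            best = cost if best is None else min(best, cost)
    return best


if __name__ == "__main__":
    p = [0.9, 0.1]
    print(weighted_distance("ab", "ab", p, "ab"))
    print(accepts(draw_samples("ab", p, 50), "ab"))
```

With `p = [0.9, 0.1]`, the text "ab" is cheap to repair with respect to p (modify the low-weight position), even though half of its symbols would have to change under the uniform measure.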

We highlight the fact that while we require that the testing algorithm work for every distribution and when only provided with samples, the complexity we get matches a known lower bound for a special case of the seemingly easier problem of testing subsequence-freeness with one-sided error under the uniform distribution and with queries (Canonne et al., Theory of Computing, 2019).

REFERENCES

  [1] Noga Alon, Rani Hod, and Amit Weinstein. 2016. On active and passive testing. Combinatorics, Probability and Computing 25, 1 (2016), 1–20.
  [2] Noga Alon, Michael Krivelevich, Ilan Newman, and Mario Szegedy. 2001. Regular languages are testable with a constant number of queries. SIAM Journal on Computing 30, 6 (2001), 1842–1862.
  [3] Maria Balcan, Eric Blais, Avrim Blum, and Liu Yang. 2012. Active property testing. In Proceedings of the 53rd Annual Symposium on Foundations of Computer Science (FOCS). 21–30.
  [4] Gabriel Bathie and Tatiana Starikovskaya. 2021. Property testing of regular languages with applications to streaming property testing of visibly pushdown languages. In Proceedings of the 48th International Colloquium on Automata, Languages and Programming (ICALP). 119:1–119:17.
  [5] Omri Ben-Eliezer. 2019. Testing local properties of arrays. In Proceedings of the 10th Innovations in Theoretical Computer Science Conference (ITCS). 11:1–11:20.
  [6] Omri Ben-Eliezer and Clément L. Canonne. 2018. Improved bounds for testing forbidden order patterns. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). 2093–2112.
  [7] Omri Ben-Eliezer, Clément L. Canonne, Shoham Letzter, and Erik Waingarten. 2019. Finding monotone patterns in sublinear time. In Proceedings of the 60th Annual Symposium on Foundations of Computer Science (FOCS). 1469–1494.
  [8] Omri Ben-Eliezer, Simon Korman, and Daniel Reichman. 2017. Deleting and testing forbidden patterns in multi-dimensional arrays. In Proceedings of the 44th International Colloquium on Automata, Languages and Programming (ICALP). 9:1–9:14.
  [9] Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. 2016. Testing convexity of figures under the uniform distribution. In Proceedings of the 32nd International Symposium on Computational Geometry (SoCG). 17:1–17:15.
  [10] Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. 2016. Tolerant testers of image properties. In Proceedings of the 43rd International Colloquium on Automata, Languages and Programming (ICALP). 462:1–462:14.
  [11] Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms. 2021. VC dimension and distribution-free sample-based testing. In Proceedings of the 53rd Annual ACM Symposium on the Theory of Computing (STOC). 504–517.
  [12] Nader Bshouty. 2019. Almost optimal distribution-free junta testing. In Proceedings of the 34th IEEE Annual Conference on Computational Complexity (CCC). 2:1–2:13.
  [13] Nader Bshouty. 2020. Almost optimal testers for concise representations. In Proceedings of the 24th International Workshop on Randomization and Computation (RANDOM). 5:1–5:20.
  [14] Clément L. Canonne. 2020. A Survey on Distribution Testing: Your Data is Big. But is it Blue? 1–100.
  [15] Clément L. Canonne, Elena Grigorescu, Siyao Guo, Akash Kumar, and Karl Wimmer. 2019. Testing \( k \)-monotonicity: The rise and fall of boolean functions. Theory of Computing 15, 1 (2019), 1–55.
  [16] Xi Chen, Zhengyang Liu, Rocco A. Servedio, Ying Sheng, and Jinyu Xie. 2018. Distribution-free junta testing. In Proceedings of the 50th Annual ACM Symposium on the Theory of Computing (STOC). 749–759.
  [17] Xi Chen and Jinyu Xie. 2016. Tight bounds for the distribution-free testing of monotone conjunctions. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). 54–71.
  [18] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill.
  [19] Ilias Diakonikolas and Daniel Kane. 2016. A new approach for testing properties of discrete distributions. In Proceedings of the 57th Annual Symposium on Foundations of Computer Science (FOCS). 685–694.
  [20] Yevgeniy Dodis, Oded Goldreich, Eric Lehman, Sofya Raskhodnikova, Dana Ron, and Alex Samorodnitsky. 1999. Improved bounds for testing monotonicity. In Proceedings of the 3rd International Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM). 97–108.
  [21] Elya Dolev and Dana Ron. 2011. Distribution-free testing for monomials with a sublinear number of queries. Theory of Computing 7, 1 (2011), 155–176.
  [22] Funda Ergün, Sampath Kannan, Ravi Kumar, Ronitt Rubinfeld, and Mahesh Viswanathan. 1998. Spot-checkers. In Proceedings of the 30th Annual ACM Symposium on the Theory of Computing (STOC). 259–268.
  [23] Eldar Fischer, Eric Lehman, Ilan Newman, Sofya Raskhodnikova, Ronitt Rubinfeld, and Alex Samorodnitsky. 2002. Monotonicity testing over general poset domains. In Proceedings of the 34th Annual ACM Symposium on the Theory of Computing (STOC). 474–483.
  [24] Eldar Fischer and Ilan Newman. 2007. Testing of matrix-poset properties. Combinatorica 27, 3 (2007), 293–327.
  [25] Lester R. Ford and Delbert R. Fulkerson. 1956. Maximal flow through a network. Canadian Journal of Mathematics 8 (1956), 399–404.
  [26] Cody R. Freitag, Eric Price, and William J. Swartworth. 2017. Testing hereditary properties of sequences. In Proceedings of the 21st International Workshop on Randomization and Computation (RANDOM). 44:1–44:10.
  [27] Dana Glasner and Rocco A. Servedio. 2009. Distribution-free testing lower bound for basic boolean functions. Theory of Computing 5, 1 (2009), 191–216.
  [28] Oded Goldreich. 2016. The uniform distribution is complete with respect to testing identity to a fixed distribution. ECCC TR16-015. To appear in the book: Computational Complexity and Property Testing, LNCS 12050, pages 152–172. 2020.
  [29] Oded Goldreich. 2017. Introduction to Property Testing. Cambridge University Press.
  [30] Oded Goldreich, Shafi Goldwasser, Eric Lehman, Dana Ron, and Alex Samorodnitsky. 2000. Testing monotonicity. Combinatorica 20, 3 (2000), 301–337.
  [31] Oded Goldreich, Shafi Goldwasser, and Dana Ron. 1998. Property testing and its connection to learning and approximation. Journal of the ACM 45, 4 (1998), 653–750.
  [32] Oded Goldreich and Dana Ron. 2016. On sample-based testers. ACM Transactions on Computation Theory 8, 2 (2016), 7:1–7:54.
  [33] Shiri Halevy and Eyal Kushilevitz. 2007. Distribution-free property testing. SIAM Journal on Computing 37, 4 (2007), 1107–1138.
  [34] Michael Kearns and Dana Ron. 2000. Testing problems with sub-learning sample complexity. Journal of Computer and System Sciences 61, 3 (2000), 428–456.
  [35] Ilan Newman, Yuri Rabinovich, Deepak Rajendraprasad, and Christian Sohler. 2019. Testing for forbidden order patterns in an array. Random Structures and Algorithms 55, 2 (2019), 402–426.
  [36] Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and Nithin Varma. 2018. Parameterized property testing of functions. ACM Transactions on Computation Theory 9, 4 (2018).
  [37] Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. 2009. Strong lower bounds for approximating distribution support size and the distinct elements problem. SIAM Journal on Computing 39, 3 (2009), 813–842.
  [38] Dana Ron and Asaf Rosin. 2020. Almost optimal distribution-free sample-based testing of \( k \)-modality. In Proceedings of the 24th International Workshop on Randomization and Computation (RANDOM). 27:1–27:19.
  [39] Ronitt Rubinfeld and Madhu Sudan. 1996. Robust characterization of polynomials with applications to program testing. SIAM Journal on Computing 25, 2 (1996), 252–271.
  [40] Wojciech Szpankowski. 2001. Average Case Analysis of Algorithms on Sequences. Wiley-Interscience, New York.

    • Published in

      ACM Transactions on Computation Theory  Volume 14, Issue 1
      March 2022
      155 pages
      ISSN:1942-3454
      EISSN:1942-3462
      DOI:10.1145/3505197
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 23 March 2022
      • Accepted: 1 November 2021
      • Revised: 1 September 2021
      • Received: 1 January 2021

      Qualifiers

      • research-article
      • Refereed