Abstract
Learnability in Valiant's PAC learning model has been shown to be strongly related to the existence of uniform laws of large numbers. These laws define a distribution-free convergence property of means to expectations uniformly over classes of random variables. Classes of real-valued functions enjoying such a property are also known as uniform Glivenko-Cantelli classes. In this paper, we prove, through a generalization of Sauer's lemma that may be interesting in its own right, a new characterization of uniform Glivenko-Cantelli classes. Our characterization yields Dudley, Giné, and Zinn's previous characterization as a corollary. Furthermore, it is the first based on a simple combinatorial quantity generalizing the Vapnik-Chervonenkis dimension. We apply this result to obtain the weakest combinatorial condition known to imply PAC learnability in the statistical regression (or "agnostic") framework. Furthermore, we find a characterization of learnability in the probabilistic concept model, solving an open problem posed by Kearns and Schapire. These results show that the accuracy parameter plays a crucial role in determining the effective complexity of the learner's hypothesis class.
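The scale-sensitive combinatorial quantity alluded to in the abstract is the fat-shattering dimension fat_γ: a set of points is γ-shattered if some witness levels allow every above/below dichotomy to be realized by a class member with margin γ. As an illustrative aside (not from the paper), the following brute-force Python sketch computes fat_γ for a small finite function class on a finite domain; the function names and the toy class are hypothetical. Since any feasible witness interval has endpoints of the form f(x) ± γ, it suffices to search witnesses over those candidate values.

```python
from itertools import combinations, product

def gamma_shattered(fs, S, r, gamma):
    """True if witness tuple r gamma-shatters the points S: every
    0/1 dichotomy is realized by some f in fs with margin gamma."""
    for bits in product([0, 1], repeat=len(S)):
        if not any(
            all(f[x] >= r[i] + gamma if b else f[x] <= r[i] - gamma
                for i, (x, b) in enumerate(zip(S, bits)))
            for f in fs
        ):
            return False
    return True

def fat_dim(fs, domain, gamma):
    """Brute-force fat-shattering dimension fat_gamma of a finite
    class fs (each f is a dict: point -> value) on a finite domain."""
    best = 0
    for d in range(1, len(domain) + 1):
        shattered_some = False
        for S in combinations(domain, d):
            # It suffices to try witness values of the form f(x) +/- gamma.
            cands = [sorted({f[x] - gamma for f in fs} | {f[x] + gamma for f in fs})
                     for x in S]
            if any(gamma_shattered(fs, S, r, gamma) for r in product(*cands)):
                best, shattered_some = d, True
                break
        if not shattered_some:
            break
    return best

# Toy class: four functions taking every pattern of values {2, 8} on two points.
fs = [{0: a, 1: b} for a in (2, 8) for b in (2, 8)]
print(fat_dim(fs, [0, 1], 3))  # -> 2: witness level 5 works at both points
print(fat_dim(fs, [0, 1], 4))  # -> 0: margin 4 exceeds half the value gap
```

The example illustrates the scale sensitivity the abstract emphasizes: the same class has dimension 2 at scale γ = 3 but dimension 0 at γ = 4, so the effective complexity seen by the learner depends on the accuracy parameter.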
References
- ALON, N., AND MILMAN, V. 1983. Embedding of l^k_∞ in finite dimensional Banach spaces. Israel J. Math. 45, 265-280.
- ASSOUAD, P., AND DUDLEY, R. 1989. Minimax nonparametric estimation over classes of sets. Preprint.
- BARTLETT, P., AND LONG, P. 1995. More theorems about scale-sensitive dimensions and learning. In Proceedings of the 8th Annual Conference on Computational Learning Theory. ACM, New York, pp. 392-401.
- BARTLETT, P., LONG, P., AND WILLIAMSON, R. 1996. Fat-shattering and the learnability of real-valued functions. J. Comput. Syst. Sci. 52, 3, 434-452.
- BEN-DAVID, S., CESA-BIANCHI, N., HAUSSLER, D., AND LONG, P. 1995. Characterizations of learnability for classes of {0, ..., n}-valued functions. J. Comput. Syst. Sci. 50, 1, 74-86.
- BLUMER, A., EHRENFEUCHT, A., HAUSSLER, D., AND WARMUTH, M. 1989. Learnability and the Vapnik-Chervonenkis dimension. J. ACM 36, 4 (Oct.), 929-965.
- COLLINS, K., SHOR, P., AND STEMBRIDGE, J. 1987. A lower bound for {0, 1, *} tournament codes. Disc. Math. 63, 15-19.
- DUDLEY, R. 1984. A course on empirical processes. In Lecture Notes in Mathematics, vol. 1097. Springer-Verlag, New York, pp. 2-142.
- DUDLEY, R., GINÉ, E., AND ZINN, J. 1991. Uniform and universal Glivenko-Cantelli classes. J. Theoret. Prob. 4, 485-510.
- GINÉ, E., AND ZINN, J. 1984. Some limit theorems for empirical processes. Ann. Prob. 12, 929-989.
- GUYON, I., VAPNIK, V., BOSER, B., BOTTOU, L., AND SOLLA, S. 1991. Structural risk minimization for character recognition. In Proceedings of the 1991 Conference on Advances in Neural Information Processing Systems, pp. 471-479.
- HAUSSLER, D. 1992. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Inf. Comput. 100, 1, 78-150.
- HAUSSLER, D., AND LONG, P. 1995. A generalization of Sauer's lemma. J. Combin. Theory, Ser. A 71, 219-240.
- KEARNS, M., AND SCHAPIRE, R. 1994. Efficient distribution-free learning of probabilistic concepts. J. Comput. Syst. Sci. 48, 3, 464-497.
- MILMAN, V. 1982. Some remarks about embedding of l^k_∞ in finite dimensional spaces. Israel J. Math. 43, 129-138.
- POLLARD, D. 1990. Empirical Processes: Theory and Applications. Vol. 2 of NSF-CBMS Regional Conference Series in Probability and Statistics. Institute of Mathematical Statistics and American Statistical Association.
- RISSANEN, J. 1978. Modeling by shortest data description. Automatica 14, 465-471.
- SAUER, N. 1972. On the density of families of sets. J. Combin. Theory, Ser. A 13, 145-147.
- SHELAH, S. 1972. A combinatorial problem: Stability and order for models and theories in infinitary languages. Pac. J. Math. 41, 247-261.
- SIMON, H. 1994. Bounds on the number of examples needed for learning functions. In Proceedings of the 1st Euro-COLT Workshop. The Institute of Mathematics and Its Applications, pp. 83-94.
- VAN LINT, J. 1985. {0, 1, *} distance problems in combinatorics. In Lecture Notes of the London Mathematical Society, vol. 103. Cambridge University Press, Cambridge, England, pp. 113-135.
- VAPNIK, V. 1982. Estimation of Dependences Based on Empirical Data. Springer-Verlag, New York.
- VAPNIK, V. 1989. Inductive principles of the search for empirical dependencies. In Proceedings of the 2nd Annual Workshop on Computational Learning Theory, pp. 1-21.
- VAPNIK, V., AND CHERVONENKIS, A. 1971. On the uniform convergence of relative frequencies of events to their probabilities. Theory Prob. Applic. 16, 2, 264-280.
- VAPNIK, V., AND CHERVONENKIS, A. 1981. Necessary and sufficient conditions for the uniform convergence of means to their mathematical expectations. Theory Prob. Applic. 26, 3, 532-553.
Index Terms
Scale-sensitive dimensions, uniform convergence, and learnability