DOI: 10.1145/130385.130401

A training algorithm for optimal margin classifiers

Published: 01 July 1992

Abstract

A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.
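The margin-maximization idea summarized above can be sketched as a small quadratic program solved in its dual form, where the weight vector comes out as a linear combination of the supporting patterns (those with nonzero multipliers). The sketch below is illustrative only, not the paper's actual solver: it folds the bias into a constant feature so the sole constraint is non-negativity of the multipliers, and uses exact coordinate ascent on the dual. The function names (`train_optimal_margin`, `predict`) are invented for this example.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def train_optimal_margin(X, y, cycles=2000):
    # Fold the bias into a constant feature, so only a_i >= 0 remains
    # as a constraint in the dual (a simplification of the usual QP).
    Xa = [list(x) + [1.0] for x in X]
    n = len(Xa)
    # Dual quadratic form: Q_ij = y_i y_j <x_i, x_j>
    Q = [[y[i] * y[j] * dot(Xa[i], Xa[j]) for j in range(n)] for i in range(n)]
    a = [0.0] * n
    for _ in range(cycles):
        for i in range(n):
            # Exact coordinate maximization of the concave dual
            # W(a) = sum_i a_i - 1/2 sum_ij a_i a_j Q_ij, clipped at 0.
            grad = 1.0 - sum(Q[i][j] * a[j] for j in range(n))
            a[i] = max(0.0, a[i] + grad / Q[i][i])
    # The solution is a linear combination of the supporting patterns.
    w = [sum(a[i] * y[i] * Xa[i][k] for i in range(n)) for k in range(len(Xa[0]))]
    support = [i for i in range(n) if a[i] > 1e-8]
    return w, support

def predict(w, x):
    return 1 if dot(w, list(x) + [1.0]) >= 0.0 else -1
```

On a small linearly separable set, the multipliers concentrate on the patterns closest to the decision boundary, mirroring the abstract's observation that only a subset of training patterns (the supporting patterns) determines the solution.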



Published In

COLT '92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory
July 1992, 452 pages
ISBN: 089791497X
DOI: 10.1145/130385

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

COLT92: 5th Annual Workshop on Computational Learning Theory
July 27-29, 1992
Pittsburgh, Pennsylvania, USA

Acceptance Rates

Overall acceptance rate: 35 of 71 submissions (49%)

