DOI: 10.1145/1102351.1102358

Error limiting reductions between classification tasks

ABSTRACT

We introduce a reduction-based model for analyzing supervised learning tasks. Within this model, we devise a new reduction from multi-class cost-sensitive classification to binary classification with the following guarantee: if the learned binary classifier has error rate at most ε, then the cost-sensitive classifier has cost at most 2ε times the expected sum of costs of all possible labels. Since cost-sensitive classification can embed any bounded-loss, finite-choice supervised learning task, this result shows that any such task can be solved using a binary classification oracle. Finally, we present experimental results showing that our new reduction outperforms existing algorithms for multi-class cost-sensitive learning.
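To make the shape of such a reduction concrete, the following is a minimal sketch, not the paper's actual construction: it assumes an importance-weighted all-pairs scheme, in which each cost-sensitive example is converted into weighted binary questions of the form "is label i cheaper than label j?", and assumes a binary learner with an sklearn-style fit(X, y, sample_weight)/predict interface. The class name CostSensitiveToBinary and the pair-encoding trick are illustrative assumptions.

    # Illustrative sketch only: one way to reduce k-class cost-sensitive
    # classification to binary classification, NOT the paper's algorithm.
    import numpy as np
    from itertools import combinations

    class CostSensitiveToBinary:
        """Reduce cost-sensitive multi-class learning to binary learning.

        For each training example (x, cost vector c) and each label pair
        (i, j), emit a binary example asking "is label i cheaper than
        label j?", weighted by |c[i] - c[j]|. At test time the labels play
        a round-robin tournament and the label with the most wins is
        predicted.
        """

        def __init__(self, binary_learner):
            # Assumed interface: fit(X, y, sample_weight) and predict(X),
            # e.g. sklearn.tree.DecisionTreeClassifier.
            self.binary_learner = binary_learner

        def fit(self, X, costs):
            n, self.k = costs.shape
            feats, labels, weights = [], [], []
            for idx in range(n):
                for i, j in combinations(range(self.k), 2):
                    w = abs(costs[idx, i] - costs[idx, j])
                    if w == 0:
                        continue  # no preference between i and j here
                    # Append the pair (i, j) to the features so a single
                    # binary classifier can answer all pairwise queries.
                    feats.append(np.concatenate([X[idx], [i, j]]))
                    labels.append(1 if costs[idx, i] < costs[idx, j] else 0)
                    weights.append(w)
            self.binary_learner.fit(np.array(feats), np.array(labels),
                                    sample_weight=np.array(weights))
            return self

        def predict(self, X):
            preds = []
            for x in X:
                wins = np.zeros(self.k)
                for i, j in combinations(range(self.k), 2):
                    q = np.concatenate([x, [i, j]]).reshape(1, -1)
                    if self.binary_learner.predict(q)[0] == 1:
                        wins[i] += 1  # binary learner says i is cheaper
                    else:
                        wins[j] += 1
                preds.append(int(np.argmax(wins)))
            return np.array(preds)

Encoding the label pair as extra features lets one binary classifier serve every pairwise query; training a separate classifier per pair is an equally valid variant. The importance weights |c[i] - c[j]| are what tie a binary mistake to the cost it induces, which is the mechanism behind error-limiting guarantees of the kind stated in the abstract.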

