ABSTRACT
We introduce a reduction-based model for analyzing supervised learning tasks. Using this model, we devise a new reduction from multi-class cost-sensitive classification to binary classification with the following guarantee: if the learned binary classifier has error rate at most ε, then the cost-sensitive classifier has cost at most 2ε times the expected sum of the costs of all possible labels. Since cost-sensitive classification can embed any bounded-loss, finite-choice supervised learning task, this result shows that any such task can be solved using a binary classification oracle. Finally, we present experimental results showing that our new reduction outperforms existing algorithms for multi-class cost-sensitive learning.
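To make the reduction idea concrete, below is a minimal Python sketch of one standard cost-sensitive-to-binary reduction: a cost-weighted all-pairs scheme, in which each pair of labels yields a binary "which label is cheaper?" example weighted by the cost difference, and prediction decodes by pairwise majority vote. The function names and the exact weighting here are illustrative assumptions; the paper's reduction uses a more careful pairwise weighting, which is what yields its 2ε guarantee.

```python
import numpy as np

def all_pairs_binary_examples(x, costs):
    """Map one cost-sensitive example (x, costs) to weighted binary examples.

    Sketch only: for each label pair (i, j), emit a binary example asking
    "is label i cheaper than label j?", weighted by |cost_i - cost_j|.
    This illustrates the reduction idea, not the paper's exact weighting.
    """
    k = len(costs)
    examples = []
    for i in range(k):
        for j in range(i + 1, k):
            weight = abs(costs[i] - costs[j])
            if weight > 0:
                # Binary label is 1 if label i has the lower cost.
                label = 1 if costs[i] < costs[j] else 0
                examples.append(((x, i, j), label, weight))
    return examples

def predict(x, k, binary_classifier):
    """Decode: return the label that wins the most pairwise comparisons."""
    wins = np.zeros(k)
    for i in range(k):
        for j in range(i + 1, k):
            if binary_classifier((x, i, j)) == 1:
                wins[i] += 1
            else:
                wins[j] += 1
    return int(np.argmax(wins))
```

In use, `all_pairs_binary_examples` would be applied to every training example to build a single importance-weighted binary dataset for the oracle learner, and `predict` decodes the oracle's pairwise answers back into a multi-class prediction; `binary_classifier` here stands in for whatever learned binary classifier the oracle returns.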