ABSTRACT
Classification rule mining aims to discover a small set of rules in the database that forms an accurate classifier. Association rule mining finds all rules in the database that satisfy given minimum support and minimum confidence constraints. For association rule mining, the target of discovery is not predetermined, while for classification rule mining there is one and only one predetermined target. In this paper, we propose to integrate these two mining techniques. The integration is done by focusing on mining a special subset of association rules, called class association rules (CARs). An efficient algorithm is also given for building a classifier based on the set of discovered CARs. Experimental results show that the classifier built this way is, in general, more accurate than that produced by the state-of-the-art classification system C4.5. In addition, this integration helps to solve a number of problems that exist in current classification systems.
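To make the notion of a class association rule concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes a rule has the form "set of attribute-value pairs -> class label", that support is the fraction of all rows matching both the condition set and the class, and that confidence is the fraction of rows matching the condition set that also carry that class. The function name `mine_cars`, the thresholds, the `max_len` cap, the brute-force enumeration, the final ordering, and the toy data are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def mine_cars(rows, labels, min_support=0.3, min_confidence=0.7, max_len=2):
    """Brute-force sketch: enumerate rules `condset -> class` and keep those
    meeting the minimum support and minimum confidence thresholds.

    Illustrative only; an efficient miner would prune candidates level by
    level instead of enumerating every combination.
    """
    n = len(rows)
    cond_counts = Counter()  # rows matching the condition set
    rule_counts = Counter()  # rows matching the condition set AND the class
    for row, label in zip(rows, labels):
        items = sorted(row.items())  # canonical order for attribute-value pairs
        for k in range(1, max_len + 1):
            for condset in combinations(items, k):
                cond_counts[condset] += 1
                rule_counts[(condset, label)] += 1

    rules = []
    for (condset, label), hits in rule_counts.items():
        support = hits / n
        confidence = hits / cond_counts[condset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((condset, label, support, confidence))

    # Assumed ordering for readability: higher confidence first, then higher
    # support, then shorter condition sets.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


# Hypothetical toy data: each row is a dict of attribute -> value.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "sunny", "windy": "no"},
]
labels = ["play", "play", "stay", "stay", "play"]

rules = mine_cars(rows, labels)
for condset, label, support, confidence in rules:
    print(dict(condset), "->", label, f"sup={support:.2f} conf={confidence:.2f}")
```

Unlike general association rule mining, the right-hand side here is restricted to the single predetermined class attribute, which is the restriction that lets the mined rules feed a classifier.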


