Improved Accuracy for Decision Tree Algorithm Based on Unsupervised Discretization
Ihsan A. Kareem, Mehdi G. Duaimi
Journal Title: International Journal of Computer Science and Mobile Computing - IJCSMC
The decision tree is an important classification technique in data mining. Decision trees have proved to be valuable tools for the classification, description, and generalization of data. Work on building decision trees for data sets exists in multiple disciplines, such as signal processing, pattern recognition, decision theory, statistics, machine learning, and artificial neural networks. This paper deals with the problem of finding the parameter settings of a decision tree algorithm in order to build an accurate tree. The proposed technique is an unsupervised filter: the suggested discretization is applied to the data before the C4.5 algorithm constructs a decision tree. The improvement to C4.5 consists of two phases. The first phase discretizes all continuous attributes instead of dealing with their numerical values directly. The second phase constructs a decision tree model and evaluates its performance. The approach was tested on three data sets, all taken from the popular UCI (University of California at Irvine) data repository. The experimental results show that C4.5 performs better after discretization than before it.
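The two phases can be illustrated with a minimal sketch. The abstract does not say which unsupervised discretization method is used, so equal-width binning is assumed here purely for illustration. The gain ratio shown is the standard C4.5 split criterion. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def equal_width_discretize(values, n_bins):
    """Phase 1 (assumed method): map each numeric value to a bin index.

    Equal-width binning is unsupervised because it ignores class labels.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete items."""
    total = len(items)
    counts = Counter(items).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)


def gain_ratio(attribute, labels):
    """Phase 2 building block: C4.5 gain ratio of a discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(attribute)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one continuous attribute and a class label.
measurements = [0.2, 0.3, 0.4, 1.3, 1.5, 1.8, 2.0, 2.3]
classes = ["a", "a", "a", "b", "b", "b", "c", "c"]

bins = equal_width_discretize(measurements, n_bins=3)
ratio = gain_ratio(bins, classes)
print(bins, ratio)
```

In a full pipeline, every continuous attribute would be discretized this way before tree induction, and the resulting tree's accuracy would be compared with that of a tree built on the raw numeric values.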