Improving Decision Trees by Clustering
نویسنده
چکیده
Multi-modal classification problems arise in many fields and form an important class of problems. The presence of disjoint areas for each class creates special problems for techniques that cannot partition each class into more than one region. Among the various techniques that have been applied with some success to multi-modal problems are decision tree classifiers (DTCs) and back propagation neural networks. DTCs use a recursive partitioning algorithm to partition the feature space into disjoint areas. In principle, DTCs should perform well on multi-modal data sets because they have the ability to partition each class into disjoint areas. In practice, however, DTCs can have significant difficulty with such data sets. In the presence of multimodality decision trees can became very bushy leading to high error rates. We show that clustering of the data before classification can lead to simpler trees and lower error rates.
منابع مشابه
Improving Accuracy of Decision Trees Using Clustering Techniques
Data mining is an important part of information management technology. Simply put, it is a method to extract and analyze meaningful patterns and correlations in a large relational database. In Data mining, Decision trees are one of the most worldwide used tools for decision support. In the emerging area of Data mining applications, users of data mining tools are faced with the problem of data s...
متن کاملImproving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
Recently by developing the technology, the number of network-based servicesis increasing, and sensitive information of users is shared through the Internet.Accordingly, large-scale malicious attacks on computer networks could causesevere disruption to network services so cybersecurity turns to a major concern fornetworks. An intrusion detection system (IDS) could be cons...
متن کاملImproving Accuracy using different Data Mining Algorithms
Mining large data set is an important issue to deal with as data is growing as the field grows. Today, crime rate is a menace that each country faces. With the increase in crime rate the data is increasing and it is such a critical field that accuracy is important at the same time. This paper shows the comparison in the results between clustering and the classification. K means is used in clust...
متن کاملTree approximation for discrete time stochastic processes: a process distance approach
Approximating stochastic processes by scenario trees is important in decision analysis. In this paper we focus on improving the approximation quality of trees by smaller, tractable trees. In particular we propose and analyze an iterative algorithm to construct improved approximations: given a stochastic process in discrete time and starting with an arbitrary, approximating tree, the algorithm i...
متن کاملData Mining Approach to the Sunspot Classification Problem
Sunspots are the subject of interest to many astronomers and solar physicists. Sunspot observation, analysis and classification form an important part of furthering the knowledge about the Sun. Sunspot classification is a manual and thus very labor intensive process that could be automated if it can be successfully learned by a machine. This paper deals with data mining approaches to the proble...
متن کامل