Entropy based classification trees
Authors
Abstract
One method for building classification trees is to choose split variables by maximising expected entropy. This can be extended through the application of imprecise probability by replacing instances of expected entropy with the maximum possible expected entropy over credal sets of probability distributions. Such methods may not take full advantage of the opportunities offered by imprecise probability theory. In this paper, we change focus from maximum possible expected entropy to the full range of expected entropy. We present an entropy minimisation algorithm using the non-parametric inference approach to multinomial data. We also present an interval comparison method based on two user-chosen parameters, which includes previously presented splitting criteria (maximum entropy and entropy interval dominance) as special cases. This method is then applied to 13 datasets, and the various possible values of the two user-chosen criteria are compared with regard to each other, and to the entropy maximisation criteria which our approach generalises.
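As a rough illustration of the quantity these splitting criteria compare, the following minimal sketch computes the expected class entropy of a split in the precise (single-distribution) case. It does not implement the imprecise-probability intervals or the two-parameter comparison described above; the function names and the toy data are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the empirical class distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def expected_entropy(values, labels):
    """Expected class entropy after splitting on a categorical variable.

    Each branch's entropy is weighted by the fraction of rows it receives.
    """
    n = len(labels)
    branches = {}
    for v, y in zip(values, labels):
        branches.setdefault(v, []).append(y)
    return sum(len(ys) / n * entropy(ys) for ys in branches.values())


# Hypothetical toy data: two candidate split variables and a class label.
outlook = ["sun", "sun", "rain", "rain"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

# Expected entropy of the class label for each candidate split variable.
scores = {
    "outlook": expected_entropy(outlook, play),
    "windy": expected_entropy(windy, play),
}
```

In the imprecise setting the abstract describes, each entry of `scores` would become an interval of expected entropies over a credal set rather than a single number, which is why a rule for comparing intervals is needed.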
Similar resources
Performance evaluation of classification trees for building detection from aerial images and lidar data: a comparison of classification trees models
This study assesses the performance of three classification tree (CT) models (Entropy, Gain Ratio and Gini) for building detection by the fusion of laser scanner data and multi-spectral aerial images. Data from four study areas with different sensors and scene characteristics were used to assess the performance of the models. The process of performance evaluation is based on four criteria: mod...
Application of Different Methods of Decision Tree Algorithm for Mapping Rangeland Using Satellite Imagery (Case Study: Doviraj Catchment in Ilam Province)
Many researchers use satellite imagery to study the Earth's resources, since different phenomena have different spectral responses to electromagnetic radiation. One major application of satellite data is the classification of land cover. In recent years, a number of classification algorithms have been developed for the classification of remote sensing data. One of the most nota...
Information entropy and the classification of local anaesthetics
Algorithms for classification and taxonomy based on criteria such as information entropy and its production are proposed. As an example, the feasibility of replacing a given anaesthetic by similar ones in the composition of a complex drug is studied. Some local anaesthetics currently in use are classified using characteristic chemical properties of different portions of their molecules. Many cl...
Application of classification trees-J48 to model the presence of roach (Rutilus rutilus) in rivers
In the present study, classification trees (CTs-J48 algorithm) were used to study the occurrence of roach in rivers in Flanders (Belgium). The presence/absence of roach was modelled based on a set of river characteristics. The predictive performance of the CTs models was assessed based on the percentage of Correctly Classified Instances (CCI) and Cohen's kappa statistics. To find the best model...
Periodic Classification of Local Anaesthetics (Procaine Analogues)
Algorithms for classification are proposed based on criteria (information entropy and its production). The feasibility of replacing a given anaesthetic by similar ones in the composition of a complex drug is studied. Some local anaesthetics currently in use are classified using characteristic chemical properties of different portions of their molecules. Many classification algorithms are based ...