Decision Trees Using the Minimum Entropy-of-Error Principle

Authors

  • Joaquim Marques de Sá
  • João Gama
  • Raquel Sebastião
  • Luís A. Alexandre
Abstract

Binary decision trees based on univariate splits have traditionally used so-called impurity functions to search for the best node splits. Such functions rely on estimates of the class distributions. In the present paper we introduce a new concept to binary tree design: instead of working with the class distributions of the data, we work directly with the distribution of the errors produced by the node splits. Concretely, we search for the best splits using a minimum entropy-of-error (MEE) strategy. This strategy has recently been applied with success in other areas (e.g. regression, clustering, blind source separation, neural network training). We show that MEE trees can produce good results, often with simpler trees, have interesting generalization properties, and, in the many experiments we performed, could be used without pruning.
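As a rough illustration of the idea in the abstract, the sketch below scores candidate univariate splits by the Shannon entropy of the error distribution rather than by a class-impurity measure. The abstract does not give the error encoding, the entropy estimator, or the search procedure, so everything here is an assumption: targets and outputs are coded 0/1, the error is e = t - y, the entropy is the plain empirical Shannon entropy, and the names `error_entropy` and `best_mee_split` are hypothetical.

```python
import math
from collections import Counter


def error_entropy(targets, outputs):
    """Empirical Shannon entropy (in nats) of the errors e = t - y.

    Assumption: targets and outputs are coded 0/1, so e is in {-1, 0, 1}.
    """
    errors = [t - y for t, y in zip(targets, outputs)]
    n = len(errors)
    return -sum((c / n) * math.log(c / n) for c in Counter(errors).values())


def best_mee_split(xs, targets):
    """Exhaustively search thresholds on one feature for the lowest error entropy.

    Assumed split rule: predict 1 when x > threshold (or the flipped rule,
    predict 1 when x <= threshold). Candidate thresholds are midpoints
    between consecutive distinct feature values.
    Returns (entropy, threshold, flipped), or None if no split is possible.
    """
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        for flipped in (False, True):
            outputs = [int((x > thr) != flipped) for x in xs]
            h = error_entropy(targets, outputs)
            if best is None or h < best[0]:
                best = (h, thr, flipped)
    return best


# Toy usage: one feature, two classes that separate between 2.0 and 3.0.
xs = [1.0, 2.0, 3.0, 4.0]
targets = [0, 0, 1, 1]
best = best_mee_split(xs, targets)
print(best)
```

Note that an error distribution concentrated on a single nonzero value also has zero entropy, so a practical criterion would need a safeguard against such degenerate splits; this sketch only compares the two orientations of each threshold and does not address that case.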

Similar resources

Uncertainty Measures of Rough Set Prediction

The main statistic used in rough set data analysis, the approximation quality, is of limited value when there is a choice of competing models for predicting a decision variable. In keeping with the rough set philosophy of non-invasive data analysis, we present three model selection criteria, using information theoretic entropy in the spirit of the minimum description length principle. Our ma...

Estimating Suspended Sediment by Artificial Neural Network (ANN), Decision Trees (DT) and Sediment Rating Curve (SRC) Models (Case study: Lorestan Province, Iran)

The aim of this study was to estimate suspended sediment by the ANN model, DT with CART algorithm and different types of SRC, in ten stations from the Lorestan Province of Iran. The results showed that the accuracy of ANN with Levenberg-Marquardt back propagation algorithm is more than the two other models, especially in high discharges. Comparison of different intervals in models showed that r...

ISAR Image Improvement Using STFT Kernel Width Optimization Based On Minimum Entropy Criterion

Nowadays, Radar systems have many applications and radar imaging is one of the most important of these applications. Inverse Synthetic Aperture Radar (ISAR) is used to form an image from moving targets. Conventional methods use Fourier transform to retrieve Doppler information. However, because of maneuvering of the target, the Doppler spectrum becomes time-varying and the image is blurred. Joi...

Combination of Evidence Using the Principle of Minimum Information Gain

One of the most important aspects in any treatment of uncertain information is the rule of combination for updating the degrees of uncertainty. The theory of belief functions uses the Dempster rule to combine two belief functions defined by independent bodies of evidence. However, with limited dependency information about the accumulated belief the Dempster rule may lead to unsatisfactory r...

Inferring Hierarchical Clustering Structures by Deterministic Annealing

The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree...

Publication date: 2009