Families of splitting criteria for classification trees
Abstract
Several splitting criteria for binary classification trees are shown to be expressible as weighted sums of two values of divergence measures. This weighted-sum approach is then used to form two families of splitting criteria. One contains the chi-squared and entropy criteria; the other contains the mean posterior improvement criterion. Members of both families are shown to have the property of exclusive preference. Furthermore, the optimal splits based on the proposed families are studied. We find that the best splits depend on the family parameters. The results reveal interesting differences among the various criteria. Examples are given to demonstrate the usefulness of both families.
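The weighted-sum idea can be illustrated with a minimal Python sketch. It assumes, as one plausible reading, that the first family is built on the Cressie-Read power divergence (suggested by the "power divergence" keyword in a related abstract); the abstract itself does not give the paper's parametrization. The names `power_divergence`, `split_criterion`, and the parameter `lam` are hypothetical. Each child node's class distribution is compared with the parent's, and the two divergence values are weighted by the share of cases sent to each child. Under this assumption, `lam = 0` gives the entropy (information gain) criterion and `lam = 1` gives the Pearson chi-squared statistic divided by twice the sample size.

```python
import math


def power_divergence(q, p, lam):
    """Cressie-Read power divergence of distribution q from p (assumed form).

    Requires lam != -1 and p[j] > 0 wherever q[j] > 0.
    """
    pairs = [(qj, pj) for qj, pj in zip(q, p) if qj > 0]
    if abs(lam) < 1e-12:
        # Limit lam -> 0: Kullback-Leibler divergence.
        return sum(qj * math.log(qj / pj) for qj, pj in pairs)
    total = sum(qj * ((qj / pj) ** lam - 1.0) for qj, pj in pairs)
    return total / (lam * (lam + 1.0))


def split_criterion(left_counts, right_counts, lam):
    """Weighted sum of two divergence values for a binary split.

    left_counts / right_counts hold the class counts in each child node.
    """
    n_left = sum(left_counts)
    n_right = sum(right_counts)
    n = n_left + n_right
    # Class distribution in the parent node and in each child.
    parent = [(l + r) / n for l, r in zip(left_counts, right_counts)]
    q_left = [l / n_left for l in left_counts]
    q_right = [r / n_right for r in right_counts]
    # Weights are the proportions of cases sent left and right.
    return ((n_left / n) * power_divergence(q_left, parent, lam)
            + (n_right / n) * power_divergence(q_right, parent, lam))


if __name__ == "__main__":
    left, right = [40, 10], [10, 40]
    for lam in (0.0, 0.5, 1.0, 2.0):
        print(lam, split_criterion(left, right, lam))
```

Scoring the same candidate split at several values of `lam` shows how the ranking of splits can change with the family parameter, which is the kind of dependence the abstract reports.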
Similar resources
Selecting the best categorical split for classification trees
Based on a family of splitting criteria for classification trees, methods of selecting the best categorical splits are studied. They are shown to be very useful in reducing the computational complexity of the exhaustive search method. Keywords: classification tree; power divergence; splitting criteria
New Splitting Criteria for Decision Trees in Stationary Data Streams.
The most popular tools for stream data mining are based on decision trees. In the previous 15 years, all designed methods, headed by the very fast decision tree algorithm, relied on Hoeffding's inequality, and hundreds of researchers followed this scheme. Recently, we have demonstrated that although the Hoeffding decision trees are an effective tool for dealing with stream data, they are a purely h...
A Splitting Criteria Based on Similarity in Decision Tree Learning
Decision trees are considered to be the most effective and widely used data mining technique for classification; their representation is intuitive and generally easy for humans to comprehend. The most critical issue in the learning process of decision trees is the splitting criterion. In this paper, we first provide the definition of the similarity computation that is usually used in data cluster...
A Multi-Criteria Evaluation approach to Delineation of Suitable Areas for Planting Trees (Case Study: Juglans regia in Gharnaveh Watershed of Golestan Province)
For successful tree establishment, an evaluation of land suitability is necessary. In this paper, we demonstrate how to implement fuzzy classification of land suitability in a GIS environment for afforestation with Juglans regia in Gharnaveh Watershed of Golestan Province in Iran. Juglans regia is one of the most important agro-forestry species in many rural parts of Iran. Relevant criteria for...
Decision Trees for Ranking: Effect of new smoothing methods, new splitting criteria and simple pruning methods
In this work we investigate several issues in order to improve the performance of probabilistic estimation trees (PETs). First, we derive a new probability smoothing that takes into account the class distributions of all the nodes from the root to each leaf. This enhances probability estimations with respect to other previous approaches without smoothing or with Laplace correction. Secondly, we...
Journal title:
- Statistics and Computing
Volume 9
Publication date: 1999