Similar resources
Error-Based Pruning of Decision Trees Grown on Very Large Data Sets Can Work!
It has been asserted that, using traditional pruning methods, growing decision trees with increasingly larger amounts of training data will result in larger tree sizes even when accuracy does not increase. With regard to error-based pruning, the experimental data used to illustrate this assertion have apparently been obtained using the default setting for pruning strength; in particular, using ...
Redescription Mining Over non-Binary Data Sets Using Decision Trees
Scientific data mining aims to extract useful information from huge data sets with the help of computation. Recently, scientists have encountered an overload of data that describe domain entities from different sides. Many of them provide alternative means to organize information. And every alternative data set offers a different perspective onto the studied problem. Redescription minin...
Overpruning Large Decision Trees
This paper presents empirical evidence for five hypotheses about learning from large noisy domains: that trees built from very large training sets are larger and more accurate than trees built from even large subsets; that this increased accuracy is only in part due to the extra size of the trees; and that the extra training instances allow both better choices of attribute while building the tr...
Efficient Hierarchical Clustering of Large Data Sets Using P-trees
Hierarchical clustering methods have attracted much attention by giving the user a maximum amount of flexibility. Rather than requiring parameter choices to be predetermined, the result represents all possible levels of granularity. In this paper a hierarchical method is introduced that is fundamentally related to partitioning methods, such as k-medoids and k-means as well as to a density based...
Decision Tree Learning on Very Large Data Sets
Consider a labeled data set of 1 terabyte in size. A salient subset might depend upon the user's interests. Clearly, browsing such a large data set to find interesting areas would be very time-consuming. An intelligent agent which, for a given class of user, could provide hints on areas of the data that might interest the user would be very useful. Given large data sets having categories of sali...
Journal
Journal title: Uluslararası Muhendislik Arastirma ve Gelistirme Dergisi
Year: 2021
ISSN: 1308-5514
DOI: 10.29137/umagd.763490