Discretization for naive-Bayes learning: managing discretization bias and variance
Related papers
Non-Disjoint Discretization for Naive-Bayes Classifiers
Previous discretization techniques have discretized numeric attributes into disjoint intervals. We argue that this is neither necessary nor appropriate for naive-Bayes classifiers. The analysis leads to a new discretization method, Non-Disjoint Discretization (NDD). NDD forms overlapping intervals for a numeric attribute, always locating a value toward the middle of an interval to obtain more r...
Proportional k-Interval Discretization for Naive-Bayes Classifiers
This paper argues that two commonly-used discretization approaches, fixed k-interval discretization and entropy-based discretization, have sub-optimal characteristics for naive-Bayes classification. This analysis leads to a new discretization method, Proportional k-Interval Discretization (PKID), which adjusts the number and size of discretized intervals to the number of training instances, thus...
SSV Criterion Based Discretization for Naive Bayes Classifiers
Decision tree algorithms deal with continuous variables by finding split points which provide the best separation of objects belonging to different classes. Such criteria can also be used to augment methods which require or prefer symbolic data. A tool for continuous data discretization based on the SSV criterion (designed for decision trees) has been constructed. It significantly improves the perf...
Weighted Proportional k-Interval Discretization for Naive-Bayes Classifiers
The use of different discretization techniques can be expected to affect the classification bias and variance of naive-Bayes classifiers. We call such an effect discretization bias and variance. Proportional k-interval discretization (PKID) tunes discretization bias and variance by adjusting discretized interval size and number proportional to the number of training instances. Theoretical analys...
Journal
Journal title: Machine Learning
Year: 2008
ISSN: 0885-6125,1573-0565
DOI: 10.1007/s10994-008-5083-5