Selecting One Dependency Estimators in Bayesian Network Using Different MDL Scores and Overfitting Criterion
Authors
Abstract
The Averaged One-Dependence Estimator (AODE) integrates all possible Super-Parent-One-Dependence Estimators (SPODEs) and estimates class conditional probabilities by averaging them. In an AODE network, some redundant SPODEs may bias the classifier and, as a consequence, substantially reduce classification accuracy. In this paper, a family of MDL metrics is used to select SPODEs, either wholly or partially, yielding three different classifiers. Performance comparisons between these classifiers and AODE show that the theoretical analyses are not only reasonable but also efficient and effective, and the Mean Square Error (MSE) is used to test for overfitting. Experimental results indicate that the classifiers using MDL score metrics achieve better performance than the original AODE while overfitting less. At the end of the paper, further discussions and verifications of some properties of overfitting are also presented through experiments.
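To make the averaging step concrete, the following is a minimal sketch of an AODE-style classifier: each attribute in turn acts as the super-parent of a SPODE, and the class-conditional joint probabilities of all SPODEs are averaged. The Laplace smoothing and the omission of the usual minimum-frequency threshold are simplifying assumptions for illustration, not the exact formulation used in the paper.

```python
from collections import defaultdict
import numpy as np

class AODE:
    """Minimal AODE sketch: average over all Super-Parent SPODEs.

    Simplifications (assumptions, not the paper's formulation):
    Laplace smoothing everywhere, no minimum-frequency threshold,
    and no SPODE selection step.
    """

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.n, self.d = X.shape
        self.classes = np.unique(y)
        self.values = [np.unique(X[:, j]) for j in range(self.d)]
        # Count tables: (class, attr i = v) and (class, attr i = vi, attr j = vj).
        self.c1 = defaultdict(int)
        self.c2 = defaultdict(int)
        for x, c in zip(X, y):
            for i in range(self.d):
                self.c1[(c, i, x[i])] += 1
                for j in range(self.d):
                    self.c2[(c, i, x[i], j, x[j])] += 1
        return self

    def predict(self, x):
        best, best_p = None, -1.0
        for c in self.classes:
            p = 0.0
            for i in range(self.d):  # attribute i plays super-parent
                # P(c, x_i) with Laplace smoothing
                pi = (self.c1[(c, i, x[i])] + 1.0) / (
                    self.n + len(self.classes) * len(self.values[i]))
                for j in range(self.d):
                    if j == i:
                        continue
                    # P(x_j | c, x_i) with Laplace smoothing
                    pi *= (self.c2[(c, i, x[i], j, x[j])] + 1.0) / (
                        self.c1[(c, i, x[i])] + len(self.values[j]))
                p += pi
            p /= self.d  # average over the d SPODEs
            if p > best_p:
                best, best_p = c, p
        return best
```

The SPODE selection studied in the paper would replace the plain average over all `d` attributes with an average over the subset of super-parents retained by the MDL score; each prediction here costs O(d²) table lookups.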
Similar resources
How Good Is Crude MDL for Solving the Bias-Variance Dilemma? An Empirical Investigation Based on Bayesian Networks
The bias-variance dilemma is a well-known and important problem in Machine Learning. It basically relates the generalization capability (goodness of fit) of a learning method to its corresponding complexity. When we have enough data at hand, it is possible to use these data in such a way so as to minimize overfitting (the risk of selecting a complex model that generalizes poorly). Unfortunately...
Calculating the NML Distribution for Tree-Structured Bayesian Networks
We are interested in model class selection. We want to compute a criterion which, given two competing model classes, chooses the better one. When learning Bayesian network structures from sample data, an important issue is how to evaluate the goodness of alternative network structures. Perhaps the most commonly used model (class) selection criterion is the marginal likelihood, which is obtained...
Scoring functions for learning Bayesian networks
The aim of this work is to benchmark scoring functions used by Bayesian network learning algorithms in the context of classification. We considered both information-theoretic scores, such as LL, AIC, BIC/MDL, NML and MIT, and Bayesian scores, such as K2, BD, BDe and BDeu. We tested the scores in a classification task by learning the optimal TAN classifier with benchmark datasets. We conclude th...
Model selection based on Bayesian predictive densities and multiple data records
Bayesian predictive densities are used to derive model selection rules. The resulting rules hold for sets of data records where each record is composed of an unknown number of deterministic signals common to all the records and a stationary white Gaussian noise. To determine the correct model, the set of data records is partitioned into two disjoint subsets. One of the subsets is used for estim...
On the importance of using treewidth as a model-selection criterion for learning Bayesian networks
This paper is motivated by the desire to learn Bayesian networks that allow efficient inference. Traditionally, model selection criteria such as BIC/MDL focus on learning Bayesian networks that fit the data and have low representation complexity (i.e. the number of parameters needed to specify the network). However, these criteria do not take into account the complexity of inference in the resu...
Journal: J. Inf. Sci. Eng.
Volume 30, Issue -
Pages -
Publication year: 2014