Learning Bayesian Network Models from Incomplete Data using Importance Sampling
Authors
Abstract
We propose a Bayesian approach to learning Bayesian network models from incomplete data. The objective is to obtain the posterior distribution of models given the observed part of the data. We describe a new algorithm, called eMC, to simulate draws from this posterior distribution. One of the new ideas in our algorithm is to use importance sampling to approximate the posterior distribution of models given the observed data and the current imputation model. The importance sampler is constructed by defining an approximate predictive distribution for the unobserved part of the data. In this way, existing (heuristic) imputation methods that do not require exact inference can be used to generate imputations. We illustrate eMC by applying it to modeling the risk factors of coronary heart disease. In the experiments we consider different missing data mechanisms and different fractions of missing data.
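The importance sampling idea in the abstract can be illustrated on a toy problem. The sketch below is not the eMC algorithm; it is a minimal example under assumptions that do not come from the paper: two binary variables, two candidate structures (`indep` and `x_to_y`), Beta(1, 1) parameter priors, a uniform prior over models, ignorable missingness, and a heuristic proposal built from smoothed complete-case frequencies. All function names are hypothetical. Missing entries are imputed from the proposal, each completed data set is weighted by its marginal likelihood divided by the proposal probability, and the weighted sums are normalized across models. Because the data set is tiny, the estimate can be checked against exact enumeration of all imputations.

```python
import itertools
import math
import random

MODELS = ("indep", "x_to_y")  # two candidate structures over binary X and Y


def log_bb(ones, zeros):
    """Log marginal likelihood of a binary sample under a Beta(1, 1) prior."""
    return math.lgamma(ones + 1) + math.lgamma(zeros + 1) - math.lgamma(ones + zeros + 2)


def log_marginal(rows, model):
    """Log P(completed data | model) with independent Beta(1, 1) parameter priors."""
    xs = [x for x, _ in rows]
    total = log_bb(sum(xs), len(xs) - sum(xs))
    # "indep" pools all Y values; "x_to_y" scores Y separately for each X value.
    groups = [[y for _, y in rows]] if model == "indep" else [
        [y for x, y in rows if x == v] for v in (0, 1)]
    for ys in groups:
        total += log_bb(sum(ys), len(ys) - sum(ys))
    return total


def fill(row, value):
    """Replace the single missing entry (None) of a row with value."""
    return (value, row[1]) if row[0] is None else (row[0], value)


def heuristic_proposal(complete):
    """q[(col, other)]: smoothed complete-case frequency of col == 1 given the other entry."""
    q = {}
    for col in (0, 1):
        for other in (0, 1):
            vals = [r[col] for r in complete if r[1 - col] == other]
            q[(col, other)] = (sum(vals) + 1) / (len(vals) + 2)
    return q


def posterior_is(complete, incomplete, q, n_draws, rng):
    """Importance sampling estimate of P(model | observed data)."""
    sums = dict.fromkeys(MODELS, 0.0)
    for _ in range(n_draws):
        rows, log_q = list(complete), 0.0
        for row in incomplete:
            col = 0 if row[0] is None else 1
            p1 = q[(col, row[1 - col])]
            value = 1 if rng.random() < p1 else 0
            log_q += math.log(p1 if value else 1.0 - p1)
            rows.append(fill(row, value))
        for m in MODELS:
            # Importance weight: complete-data marginal likelihood over proposal mass.
            sums[m] += math.exp(log_marginal(rows, m) - log_q)
    total = sum(sums.values())  # the uniform prior over models cancels here
    return {m: s / total for m, s in sums.items()}


def posterior_exact(complete, incomplete):
    """Exact posterior by enumerating every imputation (feasible only for tiny data)."""
    sums = dict.fromkeys(MODELS, 0.0)
    for values in itertools.product((0, 1), repeat=len(incomplete)):
        rows = list(complete) + [fill(r, v) for r, v in zip(incomplete, values)]
        for m in MODELS:
            sums[m] += math.exp(log_marginal(rows, m))
    total = sum(sums.values())
    return {m: s / total for m, s in sums.items()}


# Made-up illustrative data: 14 complete rows and 4 rows with one missing entry.
complete = [(0, 0)] * 6 + [(0, 1)] + [(1, 1)] * 6 + [(1, 0)]
incomplete = [(0, None), (1, None), (None, 0), (None, 1)]
q = heuristic_proposal(complete)
estimate = posterior_is(complete, incomplete, q, n_draws=5000, rng=random.Random(0))
exact = posterior_exact(complete, incomplete)
print(estimate, exact)
```

The proposal never needs exact inference in either model; it only has to assign nonzero probability to every possible imputation, which the add-one smoothing guarantees here. With many missing entries enumeration becomes infeasible, while the sampling estimate remains computable.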
Similar resources
The modeling of body's immune system using Bayesian Networks
This paper examines urinary infection, a common symptom of immune system decline, using well-known machine learning algorithms such as Bayesian networks with both Markov and tree structures. Large-scale sampling was carried out to evaluate the performance of the Bayesian network algorithm. A total of 4052 samples were obtained from the database of the Tak...
A Permutation Genetic Algorithm For Variable Ordering In Learning Bayesian Networks From Data
Greedy score-based algorithms for learning the structure of Bayesian networks may produce very different models depending on the order in which variables are scored. These models often vary significantly in quality when applied to inference. Unfortunately, finding the optimal ordering of inputs entails search through the permutation space of variables. Furthermore, in real-world applications of...
The Variational Bayesian EM Algorithm for Incomplete Data: with Application to Scoring Graphical Model Structures
We present an efficient procedure for estimating the marginal likelihood of probabilistic models with latent variables or incomplete data. This method constructs and optimises a lower bound on the marginal likelihood using variational calculus, resulting in an iterative algorithm which generalises the EM algorithm by maintaining posterior distributions over both latent variables and parameters....
Importance Sampling on Relational Bayesian Networks
We present techniques for importance sampling from distributions defined by Relational Bayesian Networks. The methods operate directly on the abstract representation language, and therefore can be applied in situations where sampling from a standard Bayesian Network representation is infeasible. We describe experimental results from using standard, adaptive and backward sampling strategies. Fur...
Learning Deep Generative Models with Doubly Stochastic MCMC
We present doubly stochastic gradient MCMC, a simple and generic method for (approximate) Bayesian inference of deep generative models in the collapsed continuous parameter space. At each MCMC sampling step, the algorithm randomly draws a minibatch of data samples to estimate the gradient of log-posterior and further estimates the intractable expectation over latent variables via a Gibbs sample...