Dependent Multinomial Models Made Easy: Stick-Breaking with the Pólya-gamma Augmentation
Abstract
Many practical modeling problems involve discrete data that are best represented as draws from multinomial or categorical distributions. For example, nucleotides in a DNA sequence, children’s names in a given state and year, and text documents are all commonly modeled with multinomial distributions. In all of these cases, we expect some form of dependency between the draws: the nucleotide at one position in the DNA strand may depend on the preceding nucleotides, children’s names are highly correlated from year to year, and topics in text may be correlated and dynamic. These dependencies are not naturally captured by the typical Dirichlet-multinomial formulation. Here, we leverage a logistic stick-breaking representation and recent innovations in Pólya-gamma augmentation to reformulate the multinomial distribution in terms of latent variables with jointly Gaussian likelihoods, enabling us to take advantage of a host of Bayesian inference techniques for Gaussian models with minimal overhead.
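As a rough illustration of the logistic stick-breaking idea mentioned in the abstract, the sketch below maps K-1 real-valued latent variables to a K-vector of multinomial probabilities: each variable passes through a logistic function to decide what fraction of the remaining "stick" a category receives, and the last category takes whatever is left. The function names and the use of NumPy are assumptions made for illustration and are not taken from the paper. The Pólya-gamma augmentation step is not shown.

```python
import numpy as np


def logistic(x):
    """Standard logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def stick_breaking_probs(psi):
    """Map K-1 real values psi to a length-K probability vector.

    Hypothetical helper for illustration only:
      pi_k = logistic(psi_k) * prod_{j<k} (1 - logistic(psi_j))  for k < K
      pi_K = prod_{j<K} (1 - logistic(psi_j))
    """
    psi = np.asarray(psi, dtype=float)
    frac = logistic(psi)  # fraction of the remaining stick taken at each break
    # Stick length remaining before each break, plus what is left at the end.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - frac)))
    pi = np.empty(psi.size + 1)
    pi[:-1] = frac * remaining[:-1]
    pi[-1] = remaining[-1]
    return pi


# Example: Gaussian latent variables induce a distribution over 4 categories.
rng = np.random.default_rng(0)
psi = rng.normal(size=3)          # K - 1 = 3 latent Gaussian variables
pi = stick_breaking_probs(psi)    # length-4 probability vector
counts = rng.multinomial(10, pi)  # one multinomial draw with 10 trials
```

Because the latent variables are unconstrained reals, dependencies between multinomial draws can in principle be expressed by placing a correlated Gaussian prior on them, which is the kind of structure the abstract describes.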
Related Resources
A Stick-Breaking Likelihood for Categorical Data Analysis with Latent Gaussian Models
The development of accurate models and efficient algorithms for the analysis of multivariate categorical data are important and longstanding problems in machine learning and computational statistics. In this paper, we focus on modeling categorical data using Latent Gaussian Models (LGMs). We propose a novel logistic stick-breaking likelihood function for categorical LGMs that can exploit recent...
A Stick-Breaking Likelihood for Categorical Data Analysis with Latent Gaussian Models
The development of accurate models and efficient algorithms for the analysis of multivariate categorical data are important and longstanding problems in machine learning and computational statistics. In this paper, we focus on modeling categorical data using Latent Gaussian Models (LGMs). We propose a novel stick-breaking likelihood function for categorical LGMs that exploits accurate linear an...
Bayesian Analysis of Dynamic Linear Topic Models
In dynamic topic modeling, the proportional contribution of a topic to a document depends on the temporal dynamics of that topic’s overall prevalence in the corpus. We extend the Dynamic Topic Model of Blei and Lafferty (2006) by explicitly modeling document-level topic proportions with covariates and dynamic structure that includes polynomial trends and periodicity. A Markov Chain Monte Carlo ...
Gamma Processes, Stick-Breaking, and Variational Inference
While most Bayesian nonparametric models in machine learning have focused on the Dirichlet process, the beta process, or their variants, the gamma process has recently emerged as a useful nonparametric prior in its own right. Current inference schemes for models involving the gamma process are restricted to MCMC-based methods, which limits their scalability. In this paper, we present a variatio...
Sparse Bayes estimation in non-Gaussian models via data augmentation
In this paper we provide a data-augmentation scheme that unifies many common sparse Bayes estimators into a single class. This leads to simple iterative algorithms for estimating the posterior mode under arbitrary combinations of likelihoods and priors within the class. The class itself is quite large: for example, it includes quantile regression, support vector machines, and logistic and multi...
Publication date: 2015