Probit Normal Correlated Topic Models
Abstract
The logistic normal distribution has recently been adapted, via a transformation of multivariate Gaussian variables, to model the topical distribution of documents in the presence of correlations among topics. In this paper, we propose a probit normal alternative for modelling correlated topical structures. Our use of the probit model in the context of topic discovery is novel: many authors have so far concentrated solely on the logistic model, partly because of the formidable inefficiency of the multinomial probit model even for very small topical spaces. We circumvent the inefficiency of multinomial probit estimation by adapting the diagonal orthant multinomial probit to the topic model context, which allows our topic modelling scheme to handle corpora with a large number of latent topics. An additional and very important benefit of our method is that, unlike the logistic normal model, whose non-conjugacy calls for sophisticated sampling schemes, our approach exploits the natural conjugacy inherent in the auxiliary formulation of the probit model to achieve greater simplicity. Applying the proposed scheme to a well-known Associated Press corpus not only uncovers a large number of meaningful topics but also captures compellingly intuitive correlations among certain topics. In addition, our approach lends itself to further scalability thanks to existing high-performance algorithms and architectures capable of handling millions of documents.
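The conjugacy that the abstract attributes to the auxiliary formulation of the probit model can be illustrated on the simplest case, binary probit regression with latent Gaussian variables. The sketch below is not the paper's topic model or its diagonal orthant sampler. It is a minimal data-augmentation Gibbs sampler under stated assumptions: a zero-mean Gaussian prior with variance `prior_var` on the coefficients, unit-variance latent noise, and hypothetical names (`sample_truncated_normal`, `gibbs_probit`) chosen for illustration only.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal; provides cdf and inv_cdf


def sample_truncated_normal(mean, positive, rng):
    """Draw z ~ N(mean, 1) restricted to z > 0 (positive=True) or z <= 0.

    Uses the inverse-CDF method. The clipping below guards inv_cdf against
    0 and 1, so draws are only approximate when |mean| is very large.
    """
    p_neg = _STD.cdf(-mean)  # P(z <= 0) under N(mean, 1)
    if positive:
        u = rng.uniform(p_neg, 1.0)
    else:
        u = rng.uniform(0.0, p_neg)
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + _STD.inv_cdf(u)


def gibbs_probit(X, y, n_iter=300, burn_in=100, prior_var=100.0, seed=0):
    """Gibbs sampler for binary probit regression via latent variables.

    Assumed model (illustrative): z_i ~ N(x_i' beta, 1), y_i = 1 iff z_i > 0,
    with prior beta ~ N(0, prior_var * I).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Given z, the model is an ordinary Gaussian linear regression, so the
    # conditional posterior of beta is Gaussian with a fixed covariance V.
    V = np.linalg.inv(X.T @ X + np.eye(d) / prior_var)
    L = np.linalg.cholesky(V)
    beta = np.zeros(d)
    draws = []
    for it in range(n_iter):
        # Step 1: latent z given beta and y (truncated normal draws).
        mean = X @ beta
        z = np.array([
            sample_truncated_normal(m, yi == 1, rng)
            for m, yi in zip(mean, y)
        ])
        # Step 2: beta given z (conjugate Gaussian update).
        beta = V @ (X.T @ z) + L @ rng.standard_normal(d)
        if it >= burn_in:
            draws.append(beta)
    return np.array(draws)


# Usage on simulated data (all values here are made up for the example).
sim_rng = np.random.default_rng(1)
X = np.column_stack([np.ones(300), sim_rng.standard_normal(300)])
true_beta = np.array([-0.5, 1.0])
y = (X @ true_beta + sim_rng.standard_normal(300) > 0).astype(int)
draws = gibbs_probit(X, y)
print(draws.mean(axis=0))  # posterior mean estimate of beta
```

Both conditional draws are available in closed form, which is the simplicity the abstract contrasts with the non-conjugate logistic normal case. Extending the idea to many categories is where the multinomial probit becomes expensive and where the abstract invokes the diagonal orthant adaptation.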
Similar resources
Modelling of Correlated Ordinal Responses, by Using Multivariate Skew Probit with Different Types of Variance Covariance Structures
In this paper, a multivariate fundamental skew probit (MFSP) model is used to model correlated ordinal responses, which are constructed from the multivariate fundamental skew normal (MFSN) distribution owing to the greater flexibility of MFSN. To achieve an appropriate variance-covariance (VC) structure for reaching reliable statistical inferences, many types of VC structures are considered ...
The Analysis of Bayesian Probit Regression of Binary and Polychotomous Response Data
The goal of this study is to introduce a statistical method for analysing specific latent data in the regression analysis of discrete data, and to build a relation between a probit regression model (related to the discrete response) and a normal linear regression model (related to the latent data of continuous response). This method provides precise inferences on binary and multinomia...
Parameterization of multivariate random effects models for categorical data.
Alternative parameterizations and problems of identification and estimation of multivariate random effects models for categorical responses are investigated. The issues are illustrated in the context of the multivariate binomial logit-normal (BLN) model introduced by Coull and Agresti (2000, Biometrics 56, 73-80). We demonstrate that the BLN model is poorly identified unless proper restrictions...
Mixture of Normals Probit Models
This paper generalizes the normal probit model of dichotomous choice by introducing mixtures of normals distributions for the disturbance term. By mixing on both the mean and variance parameters and by increasing the number of distributions in the mixture these models effectively remove the normality assumption and are much closer to semiparametric models. When a Bayesian approach is taken, the...
Probit-Based Traffic Assignment: A Comparative Study between Link-Based Simulation Algorithm and Path-Based Assignment and Generalization to Random-Coefficient Approach
Probabilistic approach of traffic assignment has been primarily developed to provide a more realistic and flexible theoretical framework to represent traveler’s route choice behavior in a transportation network. The problem of path overlapping in network modelling has been one of the main issues to be tackled. Due to its flexible covariance structure, probit model can adequately address the pro...
Journal: CoRR
Volume: abs/1410.0908
Publication date: 2014