Stochastic Inference for Scalable Probabilistic Modeling of Binary Matrices
نویسندگان
چکیده
Fully observed large binary matrices appear in a wide variety of contexts. To model them, probabilistic matrix factorization (PMF) methods are an attractive solution. However, current batch algorithms for PMF can be inefficient because they need to analyze the entire data matrix before producing any parameter updates. We derive an efficient stochastic inference algorithm for PMF models of fully observed binary matrices. Our method exhibits faster convergence rates than more expensive batch approaches and has better predictive performance than scalable alternatives. The proposed method includes new data subsampling strategies which produce large gains over standard uniform subsampling. We also address the task of automatically selecting the size of the minibatches of data used by our method. For this, we derive an algorithm that adjusts this hyper-parameter online.
منابع مشابه
Comparative Study of Random Matrices Capability in Uncertainty Detection of Pier’s Dynamics
Because of random nature of many dependent variables in coastal engineering, treatment of effective parameters is generally associated with uncertainty. Numerical models are often used for dynamic analysis of complex structures, including mechanical systems. Furthermore, deterministic models are not sufficient for exact anticipation of structure’s dynamic response, but probabilistic models...
متن کاملScalable Statistical Relational Learning for NLP
Prerequisites: No prior knowledge of statistical relational learning is required. Abstract: Statistical Relational Learning (SRL) is an interdisciplinary research area that combines firstorder logic and machine learning methods for probabilistic inference. Although many Natural Language Processing (NLP) tasks (including text classification, semantic parsing, information extraction, coreferenc...
متن کاملProbabilistic Recurrent State-Space Models
State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g., LSTMs) proved extremely successful in modeling complex timeseries data. Fully probabilistic SSMs, however, unfortunately often prove hard to train, even for smaller problems. To overcome this limitation, we propose a scalabl...
متن کاملModeling Overlapping Communities with Node Popularities
We develop a probabilistic approach for accurate network modeling using node popularities within the framework of the mixed-membership stochastic blockmodel (MMSB). Our model integrates two basic properties of nodes in social networks: homophily and preferential connection to popular nodes. We develop a scalable algorithm for posterior inference, based on a novel nonconjugate variant of stochas...
متن کاملA Variational Analysis of Stochastic Gradient Algorithms
Stochastic Gradient Descent (SGD) is an important algorithm in machine learning. With constant learning rates, it is a stochastic process that, after an initial phase of convergence, generates samples from a stationary distribution. We show that SGD with constant rates can be effectively used as an approximate posterior inference algorithm for probabilistic modeling. Specifically, we show how t...
متن کامل