Generalized Pseudolikelihood Methods for Inverse Covariance Estimation
نویسندگان
چکیده
We introduce PseudoNet, a new pseudolikelihood -based estimator of the inverse covariance matrix, that has a number of useful statistical and computational properties. We show, through detailed experiments with synthetic as well as real-world finance and wind power data, that PseudoNet outperforms related methods in terms of estimation error and support recovery, making it well-suited for use in a downstream application, where obtaining low estimation error can be important. We also show, under regularity conditions, that PseudoNet is consistent. Our proof assumes the existence of accurate estimates of the diagonal entries of the underlying inverse covariance matrix; we additionally provide a two-step method to obtain these estimates, even in a high-dimensional setting, going beyond the proofs for related methods. Unlike other pseudolikelihood-based methods, we also show that PseudoNet does not saturate, i.e., in high dimensions, there is no hard limit on the number of nonzero entries in the PseudoNet estimate. We present a fast algorithm as well as screening rules that make computing the PseudoNet estimate over a range of tuning parameters tractable.
منابع مشابه
Asymptotic Efficiency of Deterministic Estimators for Discrete Energy-Based Models: Ratio Matching and Pseudolikelihood
Standard maximum likelihood estimation cannot be applied to discrete energy-based models in the general case because the computation of exact model probabilities is intractable. Recent research has seen the proposal of several new estimators designed specifically to overcome this intractability, but virtually nothing is known about their theoretical properties. In this paper, we present a gener...
متن کاملCommunication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation
Across a variety of scientific disciplines, sparse inverse covariance estimation is a popular tool for capturing the underlying dependency relationships in multivariate data. Unfortunately, most estimators are not scalable enough to handle the sizes of modern highdimensional data sets (often on the order of terabytes), and assume Gaussian samples. To address these deficiencies, we introduce HP-...
متن کاملJoint mean and covariance estimation with unreplicated matrix - variate data ∗
It has been proposed that complex populations, such as those that arise in genomics studies, may exhibit dependencies among observations as well as among variables. This gives rise to the challenging problem of analyzing unreplicated high-dimensional data with unknown mean and dependence structures. Matrixvariate approaches that impose various forms of (inverse) covariance sparsity allow flexib...
متن کاملStructure estimation for discrete graphical models: Generalized covariance matrices and their inverses
We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously ...
متن کاملOn the Sample Complexity of Robust PCA
We estimate the rate of convergence and sample complexity of a recent robust estimator for a generalized version of the inverse covariance matrix. This estimator is used in a convex algorithm for robust subspace recovery (i.e., robust PCA). Our model assumes a sub-Gaussian underlying distribution and an i.i.d. sample from it. Our main result shows with high probability that the norm of the diff...
متن کامل