High-dimensional Graphs and Variable Selection with the Lasso
Author: Nicolai Meinshausen
Abstract
The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs. Neighborhood selection estimates the conditional independence restrictions separately for each node in the graph and is hence equivalent to variable selection for Gaussian linear models. We show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Consistency hinges on the choice of the penalty parameter. The oracle value for optimal prediction does not lead to a consistent neighborhood estimate. Controlling instead the probability of falsely joining some distinct connectivity components of the graph, consistent estimation for sparse graphs is achieved (with exponential rates), even when the number of variables grows as the number of observations raised to an arbitrary power.
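The node-by-node procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the coordinate-descent solver, the helper names (`soft_threshold`, `lasso_cd`, `estimate_neighborhoods`, `combine`), the simulated chain data, and the penalty value `lam = 0.25` are all assumptions chosen for illustration. In particular, the penalty here is picked by hand and does not follow the choice of penalty parameter analysed in the paper. Each variable is regressed on all others with the Lasso, the nonzero coefficients define its estimated neighborhood, and the neighborhoods are merged into an edge set with either an "and" or an "or" rule.

```python
import random


def soft_threshold(z, t):
    """Shrink z towards zero by t; values with |z| <= t become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(cols, y, lam, n_sweeps=50):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    cols is a list of predictor columns, each a list of n floats.
    """
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)  # residual y - X b, with b = 0 initially
    scale = [sum(v * v for v in col) / n for col in cols]
    for _ in range(n_sweeps):
        for j, col in enumerate(cols):
            if scale[j] == 0.0:
                continue
            # Covariance of column j with the residual excluding its own term.
            rho = sum(c * r for c, r in zip(col, resid)) / n + scale[j] * beta[j]
            new = soft_threshold(rho, lam) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for c, r in zip(col, resid)]
                beta[j] = new
    return beta


def estimate_neighborhoods(columns, lam):
    """Lasso-regress each variable on all others; keep nonzero coefficients."""
    p = len(columns)
    neighborhoods = []
    for a in range(p):
        others = [b for b in range(p) if b != a]
        beta = lasso_cd([columns[b] for b in others], columns[a], lam)
        neighborhoods.append({b for b, coef in zip(others, beta) if coef != 0.0})
    return neighborhoods


def combine(neighborhoods, rule="and"):
    """Merge per-node neighborhoods into undirected edges (a, b) with a < b."""
    p = len(neighborhoods)
    edges = set()
    for a in range(p):
        for b in range(a + 1, p):
            in_a, in_b = b in neighborhoods[a], a in neighborhoods[b]
            keep = (in_a and in_b) if rule == "and" else (in_a or in_b)
            if keep:
                edges.add((a, b))
    return edges


def center(col):
    """Subtract the sample mean so no intercept is needed."""
    m = sum(col) / len(col)
    return [v - m for v in col]


# Simulated data (an assumption for illustration): chain 0 - 1 - 2, node 3 isolated.
rng = random.Random(0)
n = 1000
x0 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.8 * v + rng.gauss(0, 1) for v in x0]
x2 = [0.8 * v + rng.gauss(0, 1) for v in x1]
x3 = [rng.gauss(0, 1) for _ in range(n)]
columns = [center(c) for c in (x0, x1, x2, x3)]

neighborhoods = estimate_neighborhoods(columns, lam=0.25)
edges_and = combine(neighborhoods, rule="and")
edges_or = combine(neighborhoods, rule="or")
print("AND rule:", sorted(edges_and))
print("OR rule: ", sorted(edges_or))
```

With this simulated chain one would expect the edges between nodes 0 and 1 and between nodes 1 and 2 to be recovered and node 3 to stay isolated, although the exact edge set depends on the hand-picked penalty. By construction, the "and" rule never returns more edges than the "or" rule.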
Similar resources
High Dimensional Graphs and Variable Selection with the Lasso
The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensio...
Lasso-Type Recovery of Sparse Representations for High-Dimensional Data, by Nicolai Meinshausen and Bin, UC Berkeley
The Lasso is an attractive technique for regularization and variable selection for high-dimensional data, where the number of predictor variables p_n is potentially much larger than the number of samples n. However, it was recently discovered that the sparsity pattern of the Lasso estimator can only be asymptotically identical to the true sparsity pattern if the design matrix satisfi...
A Note on the Lasso for Gaussian Graphical Model Selection
Inspired by the success of the Lasso for regression analysis (Tibshirani, 1996), it seems attractive to estimate the graph of a multivariate normal distribution by ℓ1-norm penalised likelihood maximisation. The objective function is convex and the graph estimator can thus be computed efficiently, even for very large graphs. However, we show in this note that the resulting estimator is not consi...
Stability selection
Estimation of structure, such as in variable selection, graphical modelling or cluster analysis is notoriously difficult, especially for high-dimensional data. We introduce stability selection. It is based on subsampling in combination with (high-dimensional) selection algorithms. As such, the method is extremely general and has a very wide range of applicability. Stability selection provides f...
Relaxed Lasso
The Lasso is an attractive regularisation method for high dimensional regression. It combines variable selection with an efficient computational procedure. However, the rate of convergence of the Lasso is slow for some sparse high dimensional data, where the number of predictor variables is growing fast with the number of observations. Moreover, many noise variables are selected if the estimato...