Variable selection and estimation in generalized linear models with the seamless L0 penalty.
Authors
Abstract
In this paper, we propose variable selection and estimation in generalized linear models using the seamless L0 (SELO) penalized likelihood approach. The SELO penalty is a smooth function that very closely resembles the discontinuous L0 penalty. We develop an efficient algorithm to fit the model, and show that the SELO-GLM procedure has the oracle property in the presence of a diverging number of variables. We propose a Bayesian Information Criterion (BIC) to select the tuning parameter. We show that under some regularity conditions, the proposed SELO-GLM/BIC procedure consistently selects the true model. We perform simulation studies to evaluate the finite sample performance of the proposed methods. Our simulation studies show that the proposed SELO-GLM procedure has a better finite sample performance than several existing methods, especially when the number of variables is large and the signals are weak. We apply the SELO-GLM to analyze a breast cancer genetic dataset to identify the SNPs that are associated with breast cancer risk.
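The abstract describes the SELO penalty only in words. As a hedged sketch in our own notation (the symbols λ, τ, and the GLM log-likelihood ℓ_n are assumptions, not taken from the abstract), the seamless-L0 penalty is commonly written as

\[
p_{\lambda,\tau}(\beta_j) \;=\; \frac{\lambda}{\log 2}\,\log\!\Bigl(1+\frac{|\beta_j|}{|\beta_j|+\tau}\Bigr), \qquad \tau>0,
\]

which converges to the discontinuous L0 penalty \(\lambda\,\mathbf{1}\{\beta_j\neq 0\}\) as \(\tau\to 0^{+}\). The SELO-GLM estimator can then be read, up to scaling conventions, as the maximizer of the penalized log-likelihood

\[
\hat{\beta} \;=\; \arg\max_{\beta}\;\Bigl\{\ell_n(\beta)\;-\;n\sum_{j=1}^{p} p_{\lambda,\tau}(\beta_j)\Bigr\},
\]

with the tuning parameter λ chosen by the proposed BIC.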
Similar resources
The Florida State University College of Arts and Sciences Theories on Group Variable Selection in Multivariate Regression Models
We study group variable selection in the multivariate regression model. Group variable selection here means selecting the non-zero rows of the coefficient matrix: since there are multiple response variables, if a predictor is irrelevant to estimation then its entire corresponding row must be zero. In a high dimensional setup, shrinkage estimation methods are applicable and guarantee smaller MSE than OLS acc...
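The snippet describes row selection only verbally. A hedged sketch of the standard row-sparse (group-penalized) formulation, in our own notation and not necessarily the exact estimator studied in the thesis, is

\[
\hat{B} \;=\; \arg\min_{B\in\mathbb{R}^{p\times q}} \;\tfrac{1}{2}\,\lVert Y - XB\rVert_F^{2} \;+\; \lambda\sum_{j=1}^{p}\lVert B_{j\cdot}\rVert_2,
\]

where each group is a row \(B_{j\cdot}\) of the coefficient matrix, so the penalty removes a predictor from all \(q\) responses at once by zeroing its entire row.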
Variable Selection and Estimation with the Seamless-l0 Penalty
Penalized least squares procedures that directly penalize the number of variables in a regression model (L0 penalized least squares procedures) enjoy nice theoretical properties and are intuitively appealing. On the other hand, L0 penalized least squares methods also have significant drawbacks in that implementation is NP-hard and computationally infeasible when the number of variables is even ...
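To make the close resemblance between the smooth SELO surrogate and the discontinuous L0 penalty concrete, here is a minimal NumPy sketch of both penalties; the SELO form used is the one usually quoted for this penalty, and the function names and parameter values are our own illustrative choices.

```python
import numpy as np

def selo_penalty(beta, lam, tau):
    """Seamless-L0 (SELO) penalty evaluated elementwise.

    As tau -> 0+, this approaches lam * 1{beta != 0}, i.e. the L0 penalty.
    """
    beta = np.abs(np.asarray(beta, dtype=float))
    return (lam / np.log(2.0)) * np.log(1.0 + beta / (beta + tau))

def l0_penalty(beta, lam):
    """Discontinuous L0 penalty: a cost of lam per nonzero coefficient."""
    return lam * (np.asarray(beta) != 0).astype(float)

if __name__ == "__main__":
    grid = np.linspace(-2.0, 2.0, 9)
    print(np.round(selo_penalty(grid, lam=1.0, tau=0.01), 3))
    print(l0_penalty(grid, lam=1.0))
```

On this grid the SELO values are essentially 0 at zero and close to λ away from zero, which is the sense in which it mimics L0 while remaining continuous.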
Penalized Bregman Divergence Estimation via Coordinate Descent
Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...
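As a hedged illustration of the coordinate descent idea mentioned in the snippet, the sketch below implements plain cyclic coordinate descent for lasso-penalized linear regression, not the penalized Bregman-divergence estimator of the cited paper; lasso_cd, soft_threshold, and all parameter values are our own.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator, the closed-form coordinate-wise lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/(2n))||y - X beta||^2 + lam * ||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-coordinate curvature X_j'X_j / n
    resid = y - X @ beta                   # current residual
    for _ in range(n_iter):
        for j in range(p):
            # correlation of X_j with the partial residual that excludes coordinate j
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)   # keep residual in sync
            beta[j] = new_bj
    return beta

if __name__ == "__main__":
    rng = rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    beta_true = np.array([2.0, -1.5, 0, 0, 1.0, 0, 0, 0, 0, 0])
    y = X @ beta_true + 0.5 * rng.standard_normal(200)
    print(np.round(lasso_cd(X, y, lam=0.1), 2))
```

Each pass updates one coefficient at a time with an exact one-dimensional minimization, which is what makes coordinate descent fast for separable penalties.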
A Majorization-minimization Approach to Variable Selection Using Spike and Slab Priors
We develop a method to carry out MAP estimation for a class of Bayesian regression models in which coefficients are assigned Gaussian-based spike and slab priors. The objective function in the corresponding optimization problem has a Lagrangian form in that regression coefficients are regularized by a mixture of squared l2 and l0 norms. A tight approximation to the l0 norm using majorizati...
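The "mixture of squared l2 and l0 norms" can be illustrated with a hedged sketch in our own notation (the least-squares loss and the multipliers λ1, λ2 are assumptions, not necessarily the exact Lagrangian of the cited method):

\[
\min_{\beta}\;\tfrac{1}{2}\,\lVert y - X\beta\rVert_2^{2} \;+\; \lambda_1\lVert\beta\rVert_2^{2} \;+\; \lambda_2\lVert\beta\rVert_0,
\]

and the majorization-minimization step replaces the nonconvex \(\lVert\beta\rVert_0\) term with a tight, easier-to-minimize surrogate at each iteration.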
Variable Selection via Penalized Likelihood
Variable selection is vital to statistical data analyses. Many procedures in use are ad hoc stepwise selection procedures, which are computationally expensive and ignore stochastic errors in the variable selection process of previous steps. An automatic and simultaneous variable selection procedure can be obtained by using a penalized likelihood method. In traditional linear models, the best...
Journal: The Canadian Journal of Statistics = Revue canadienne de statistique
Volume 40, Issue 4
Pages: -
Publication date: 2012