Discussion of “Least Angle Regression”

Author

  • Hemant Ishwaran
Abstract

Being able to reliably, and automatically, select variables in linear regression models is a notoriously difficult problem. This research attacks this question head on, introducing not only a computationally efficient algorithm and method, LARS (and its derivatives), but at the same time introducing comprehensive theory explaining the intricate details of the procedure as well as theory to guide its practical implementation. This is a fascinating paper and I commend the authors for this important work.

Automatic variable selection, the main theme of this paper, has many goals. So before embarking upon a discussion of the paper it is important to first sit down and clearly identify what the objectives are. The authors make it clear in their introduction that, while often the goal in variable selection is to select a "good" linear model, where goodness is measured in terms of prediction accuracy, it is also important at the same time to choose models which lean toward the parsimonious side. So here the goals are pretty clear: we want good prediction error performance but also simpler models. These are certainly reasonable objectives and quite justifiable in many scientific settings. At the same time, however, one should recognize the difficulty of the task, as the two goals, low prediction error and smaller models, can be diametrically opposed. By this I mean that certainly from an oracle point of view it is true that minimizing prediction error will identify the true model, and thus, by going after prediction error (in a perfect world), we will also get smaller models by default. However, in practice, what happens is that small gains in prediction error often translate into larger models and less dimension reduction. So as procedures get better at reducing prediction error, they can also get worse at picking out variables accurately. Unfortunately, I have some misgivings that LARS might be falling into this trap.
Mostly my concern is fueled by the fact that Mallows' C_p is the criterion used for determining the optimal LARS model. The use of C_p


Similar resources

Summary and discussion of: “Exact Post-selection Inference for Forward Stepwise and Least Angle Regression”

In this report we summarize the recent paper [Taylor et al., 2014] which proposes new inference tools for methods that perform variable selection and estimation in an adaptive regression. Although this paper mainly studies forward stepwise regression (FS) and least angle regression (LAR), the approach in this paper is not limited to these cases. This paper describes how to carry out exact infer...


Discussion of “Least Angle Regression” by Efron

Algorithms for simultaneous shrinkage and selection in regression and classification provide attractive solutions to knotty old statistical challenges. Nevertheless, as far as we can tell, Tibshirani’s Lasso algorithm has had little impact on statistical practice. Two particular reasons for this may be the relative inefficiency of the original Lasso algorithm and the relative complexity of more...


Discussion of Least Angle Regression

Algorithms for simultaneous shrinkage and selection in regression and classification provide attractive solutions to knotty old statistical challenges. Nevertheless, as far as we can tell, Tibshirani’s Lasso algorithm has had little impact on statistical practice. Two particular reasons for this may be the relative inefficiency of the original Lasso algorithm, and the relative complexity of mor...


Regression Modeling for Spherical Data via Non-parametric and Least Square Methods

Introduction: Statistical analysis of data on the Earth's surface has been a favorite subject among many researchers. Such data can be related to animals' migration from one region to another. Statistical modeling of their paths then helps biological researchers to predict their movements and estimate the areas where the animals are most likely to be present. From a geome...


Least angle and l1 penalized regression: A review

Least Angle Regression is a promising technique for variable selection applications, offering a nice alternative to stepwise regression. It provides an explanation for the similar behavior of LASSO (l1-penalized regression) and forward stagewise regression, and provides a fast implementation of both. The idea has caught on rapidly, and sparked a great deal of research interest. In this paper, w...


Fuzzy Hybrid least-Squares Regression Approach to Estimating the amount of Extra Cellular Recombinant Protein A from Escherichia coli BL21

Introduction: Immune Protein A is a component with a vast spectrum of biochemical, biological and medical usages. The coding gene of this protein was extracted from Staphylococcus aureus and was cloned and expressed in Escherichia coli bacteria. Suitable statistical methods are utilized to optimize expression conditions for evaluating experiment accuracy, guarantee the accuracy of subsequent ...




Publication date: 2004