Correlation Pursuit: Forward Stepwise Variable Selection for Index Models.
Authors
Abstract
In this article, a stepwise procedure, correlation pursuit (COP), is developed for variable selection under the sufficient dimension reduction framework, in which the response variable Y is influenced by the predictors X(1), X(2), …, X(p) through an unknown function of a few linear combinations of them. Unlike linear stepwise regression, COP does not impose a special form of relationship (such as linear) between the response variable and the predictor variables. The COP procedure selects the variables that attain the maximum correlation between the transformed response and a linear combination of the variables. Various asymptotic properties of the COP procedure are established; in particular, its variable selection performance is investigated when both the number of predictors and the sample size diverge. The excellent empirical performance of the COP procedure in comparison with existing methods is demonstrated by both extensive simulation studies and a real example in functional genomics.
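The forward step described in the abstract can be illustrated with a minimal sketch. It assumes that the transformed response is approximated by slicing Y, as in sliced inverse regression, so that the maximum squared correlation for a variable subset is the leading eigenvalue of Cov(X)^{-1} Cov(E[X | slice]). The function names, the slice count, the threshold, and the normalized-gain form of the selection statistic are all illustrative assumptions rather than details taken from the abstract, and only forward additions are shown.

```python
import numpy as np

def sir_eigenvalues(X, y, n_slices=5, n_dirs=1):
    """Leading eigenvalues of Cov(X)^{-1} Cov(E[X | slice of y]).

    Hypothetical helper: each eigenvalue is a squared correlation between a
    slice-based transformation of y and a linear combination of X's columns.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    # Slice the sorted response into groups of roughly equal size.
    between = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Xc[idx].mean(axis=0)
        between += (len(idx) / n) * np.outer(m, m)
    eigvals = np.linalg.eigvals(np.linalg.solve(cov, between)).real
    top = np.sort(eigvals)[::-1][:n_dirs]
    # Pad with zeros when fewer than n_dirs variables are in the subset.
    return np.pad(top, (0, n_dirs - len(top)))

def forward_select(X, y, threshold, n_slices=5, n_dirs=1):
    """Greedy forward selection; stops when the best gain is below threshold."""
    n, p = X.shape
    selected = []
    lam_cur = np.zeros(n_dirs)
    while len(selected) < p:
        best_j, best_stat, best_lam = None, -np.inf, None
        for j in range(p):
            if j in selected:
                continue
            lam_new = sir_eigenvalues(X[:, selected + [j]], y, n_slices, n_dirs)
            # Assumed statistic: normalized gain in squared correlation.
            stat = n * np.sum((lam_new - lam_cur) / (1.0 - lam_new))
            if stat > best_stat:
                best_j, best_stat, best_lam = j, stat, lam_new
        if best_stat <= threshold:
            break
        selected.append(best_j)
        lam_cur = best_lam
    return selected

# Synthetic single-index example: only columns 0 and 1 influence y.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 8))
y = (X[:, 0] + X[:, 1]) ** 3 + 0.5 * rng.standard_normal(400)
selected = forward_select(X, y, threshold=30.0)
```

In this synthetic example the link function is cubic rather than linear, yet the two active columns should still be picked up first, which is the kind of behaviour the abstract contrasts with linear stepwise regression. The threshold here is an arbitrary choice for the demonstration.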
Similar resources
Note for Alan’s Class
Wavelets have proven to be immensely useful for signal analysis and representation [7]. Various dictionaries of wavelets have been designed for different types of signals or function spaces [3, 13]. Two key factors underlying the successes of wavelets are the sparsity of the representation and the efficiency of the analysis. Specifically, a signal can typically be represented by a linear superp...
Prediction of the adsorption capability onto activated carbon of liquid aliphatic alcohols using molecular fragments method
Quantitative structure-property relationships (QSPR) for estimating the adsorption of aliphatic alcohols onto activated carbon (AC) were developed using the substructural molecular fragments (SMF) method. The adsorption capacity of AC (g/100 g C) for 150 aliphatic alcohols was studied under equilibrium conditions. Forward and backward stepwise regression variable ...
Forward Selection and Estimation in High Dimensional Single Index Models
We propose a new variable selection and estimation technique for high dimensional single index models with unknown monotone smooth link function. Among many predictors, typically, only a small fraction of them have significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by variable selection. In this article, we propose a ...
Greedy and Relaxed Approximations to Model Selection: A simulation study
The Minimum Description Length (MDL) principle is an important tool for retrieving knowledge from data, as it embodies the scientific striving for simplicity in describing the relationship among variables. As MDL and other model selection criteria penalize models on their dimensionality, the estimation problem involves a combinatorial search over subsets of predictors and quickly becomes computati...
Automated variable selection methods for logistic regression produced unstable models for predicting acute myocardial infarction mortality.
OBJECTIVES: Automated variable selection methods are frequently used to determine the independent predictors of an outcome. The objective of this study was to determine the reproducibility of logistic regression models developed using automated variable selection methods. STUDY DESIGN AND SETTING: An initial set of 29 candidate variables was considered for predicting mortality after acute myoc...
Journal title:
- Journal of the Royal Statistical Society. Series B, Statistical methodology
Volume 74, Issue 5
Pages: -
Publication date: 2012