Boosting for High-multivariate Responses in High-dimensional Linear Regression

Authors

  • Roman Werner Lutz
  • Peter Bühlmann
Abstract

We propose a boosting method, multivariate L2Boosting, for multivariate linear regression based on a squared error loss for multivariate data. It can be applied to multivariate linear regression with continuous responses and to vector autoregressive time series. We prove, for i.i.d. as well as time series data, that multivariate L2Boosting can consistently recover sparse high-dimensional multivariate linear functions, even when the number of predictor variables $p_n$ and the dimension of the response $q_n$ grow almost exponentially with sample size $n$, $p_n = q_n = O(\exp(C n^{1-\xi}))$ ($0 < \xi < 1$, $0 < C < \infty$), provided that the $\ell_1$-norm of the true underlying function is finite. Our theory seems to be among the first to address the issue of large dimension of the response variable; the relevance of such settings is briefly outlined. We also identify empirically some cases where our multivariate L2Boosting performs better than applying univariate methods separately to each response component, which demonstrates that the multivariate approach can be very useful.
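
As a rough illustration of the idea summarized in the abstract, the following Python sketch runs a componentwise L2Boosting loop over several response components at once. It is a simplified version under stated assumptions, not the authors' algorithm: it uses an unweighted squared error loss (identity weighting across response components), assumes centered data with no all-zero predictor columns, and stops after a fixed number of steps. The names multivariate_l2boost, nu, and n_steps are illustrative and do not come from the source.

```python
import numpy as np


def multivariate_l2boost(X, Y, n_steps=200, nu=0.1):
    """Componentwise L2Boosting sketch for a multivariate linear model Y ~ X B.

    Assumptions (not from the source): identity weighting of the response
    components, centered X and Y, nonzero predictor columns, fixed n_steps.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    p, q = X.shape[1], Y.shape[1]

    B = np.zeros((p, q))            # coefficient matrix, starts at zero
    R = Y.copy()                    # current residuals, one column per response
    col_sq = (X ** 2).sum(axis=0)   # squared norm of each predictor column

    for _ in range(n_steps):
        G = X.T @ R                 # inner products of predictors and residuals
        # Reduction in residual sum of squares for a full least-squares
        # update of the single coefficient (j, k).
        gain = G ** 2 / col_sq[:, None]
        j, k = np.unravel_index(np.argmax(gain), gain.shape)

        # Shrunken update of the selected coefficient only.
        step = nu * G[j, k] / col_sq[j]
        B[j, k] += step
        R[:, k] -= step * X[:, j]

    return B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, q = 40, 100, 5            # more predictors than observations
    X = rng.standard_normal((n, p))
    X -= X.mean(axis=0)
    B_true = np.zeros((p, q))
    B_true[0, 0], B_true[3, 2] = 2.0, -1.5
    Y = X @ B_true + 0.1 * rng.standard_normal((n, q))
    Y -= Y.mean(axis=0)

    B_hat = multivariate_l2boost(X, Y, n_steps=100, nu=0.1)
    print("nonzero coefficients selected:", np.count_nonzero(B_hat))
```

Because each step changes a single entry of the coefficient matrix, the number of steps bounds the number of nonzero coefficients, so early stopping acts as the regularizer. In practice the stopping point would be chosen by a data-driven criterion rather than fixed in advance.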

Similar resources

Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data

In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the inf...

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...

Prioritization sub-watershed of Acemangar Basin in Chaharmahal-e- Bakhtiari for soil and water management using morphometric parameters and ensemble of TOPSIS-multivariate linear regression algorithm

Sub-watershed prioritization is very important in natural resources and watershed management. This study deals with prioritization of sub-watersheds using a mixed multivariate linear model of New TOPSIS-Regression over morphometric parameters of 11 sub-watersheds. Morphometric parameters include constant of compression ratio, roundness factor, form ratio, slenderness ratio, channel maintenance, ...

Methods for regression analysis in high-dimensional data

By evolving science, knowledge and technology, new and precise methods for measuring, collecting and recording information have been innovated, which have resulted in the appearance and development of high-dimensional data. The high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...

Regularization for generalized additive mixed models by likelihood-based boosting.

OBJECTIVE: With the emergence of semi- and nonparametric regression, the generalized linear mixed model has been extended to account for additive predictors. However, available fitting methods fail in high-dimensional settings where many explanatory variables are present. We extend the concept of boosting to generalized additive mixed models and present an appropriate algorithm that uses two diff...

Publication date: 2006