Gradient lasso for Cox proportional hazards model

Authors

  • Insuk Sohn
  • Jinseog Kim
  • Sin-Ho Jung
  • Changyi Park
Abstract

MOTIVATION: There has been increasing interest in expressing a survival phenotype (e.g. time to cancer recurrence or death) or its distribution in terms of the expression data of a subset of genes. Because gene expression data are high-dimensional, however, collinearity is a serious problem when fitting a prediction model such as Cox's proportional hazards model. To avoid the collinearity problem, several methods based on penalized Cox proportional hazards models have been proposed. However, those methods suffer from severe computational problems, such as slow or even failed convergence, because model fitting requires high-dimensional matrix inversions. We propose to implement penalized Cox regression with a lasso penalty via the gradient lasso algorithm, which converges to the global optimum faster than other algorithms. Moreover, the gradient lasso algorithm is guaranteed to converge to the optimum under mild regularity conditions. Hence, our gradient lasso algorithm can be a useful tool for developing a prediction model based on high-dimensional covariates, including gene expression data.

RESULTS: Simulation studies showed that the prediction model fitted by gradient lasso recovers the prognostic genes. In addition, results from diffuse large B-cell lymphoma datasets and the Norway/Stanford breast cancer dataset indicate that our method is very competitive with the popular existing methods of Park and Hastie and of Goeman in computational time, prediction and selectivity.

AVAILABILITY: The R package glcoxph is available at http://datamining.dongguk.ac.kr/R/glcoxph.
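To make the idea of lasso-penalized Cox regression without matrix inversion concrete, the following is a minimal sketch and not the glcoxph implementation. It minimises the negative log partial likelihood subject to an L1 bound on the coefficients using a plain Frank-Wolfe (conditional gradient) iteration, which moves one coordinate at a time and needs only gradients. This is a swapped-in, simplified technique: the paper's gradient lasso algorithm may differ in its step-size and coefficient-removal rules. All names (`cox_nll_and_grad`, `frank_wolfe_cox_lasso`, `lam`) and the simulated data are hypothetical, and the code assumes there are no tied event times.

```python
import numpy as np


def cox_nll_and_grad(beta, X, time, event):
    """Negative log partial likelihood and its gradient (assumes no tied times)."""
    order = np.argsort(-time)                  # latest time first
    Xs, ds = X[order], event[order].astype(float)
    eta = Xs @ beta
    # With times sorted in decreasing order, the risk set of row i is rows 0..i.
    log_risk = np.logaddexp.accumulate(eta)    # log of cumulative sum of exp(eta)
    w = np.exp(eta - eta.max())                # rescaled weights; the scale cancels
    risk_mean = np.cumsum(w[:, None] * Xs, axis=0) / np.cumsum(w)[:, None]
    nll = -np.sum(ds * (eta - log_risk))
    grad = -(ds[:, None] * (Xs - risk_mean)).sum(axis=0)
    return nll, grad


def frank_wolfe_cox_lasso(X, time, event, lam, n_iter=500):
    """Minimise the Cox negative log partial likelihood s.t. ||beta||_1 <= lam."""
    beta = np.zeros(X.shape[1])
    for k in range(n_iter):
        _, grad = cox_nll_and_grad(beta, X, time, event)
        j = np.argmax(np.abs(grad))            # coordinate with the largest slope
        vertex = np.zeros_like(beta)
        vertex[j] = -lam * np.sign(grad[j])    # vertex of the L1 ball minimising the linearised loss
        gamma = 2.0 / (k + 2.0)                # standard open-loop step size
        beta = (1.0 - gamma) * beta + gamma * vertex
    return beta


# Simulated example (hypothetical data): two prognostic covariates out of ten.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [1.0, -1.0]
t_event = rng.exponential(1.0 / np.exp(X @ beta_true))   # exponential survival times
t_cens = rng.exponential(2.0, size=n)                     # independent censoring
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens
beta_hat = frank_wolfe_cox_lasso(X, time, event, lam=1.5)
print(np.round(beta_hat, 3))
```

Each iterate is a convex combination of points in the L1 ball, so the constraint holds throughout without any projection or Hessian. One limitation of this simplified variant is that a coefficient, once moved away from zero, shrinks but never returns exactly to zero.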

Related resources

Extending the Iterative Convex Minorant Algorithm to the Cox Model for Interval Censored Data

The iterative convex minorant (ICM) algorithm (Groeneboom and Wellner) is fast in computing the NPMLE of the distribution function for interval censored data without covariates. We reformulate the ICM as a generalized gradient projection method (GGP), which leads to a natural extension to the Cox model. It is also easily extended to support the Lasso (Tibshirani). Some simulation results are also shown. For...

L1 penalized estimation in the Cox proportional hazards model.

This article presents a novel algorithm that efficiently computes L1 penalized (lasso) estimates of parameters in high-dimensional models. The lasso has the property that it simultaneously performs variable selection and shrinkage, which makes it very useful for finding interpretable prediction rules in high-dimensional data. The new algorithm is based on a combination of gradient ascent opti...

Extending the Iterative Convex Minorant Algorithm to the Cox Model

The iterative convex minorant (ICM) algorithm (Groeneboom and Wellner, 1992) is fast in computing the NPMLE of the distribution function for interval censored data without covariates. We reformulate the ICM as a generalized gradient projection method (GGP), which leads to a natural extension to the Cox model. It is also easily extended to support the Lasso (Tibshirani, 1996). Some simulation re...

The evaluation of Cox and Weibull proportional hazards models and their applications to identify factors influencing survival time in acute leukem

Introduction: The most important models used in the analysis of survival data are proportional hazards models. Applying these models requires verifying that the proportional hazards assumption holds; otherwise, they would lead to incorrect inference. This study aims to evaluate the Cox and Weibull models, which are used to identify factors affecting survival time in acute leukemia. Me...

A proposal for variable selection in the Cox model

We propose a new method for variable selection and estimation in Cox's proportional hazards model. Our proposal minimizes the log partial likelihood subject to the sum of the absolute values of the parameters being bounded by a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly zero and hence gives interpretable models. The method is a variat...

Journal:
  • Bioinformatics

Volume 25, Issue 14

Publication date: 2009