GAMs with integrated model selection using penalized regression splines and applications to environmental modelling
Abstract
Generalized Additive Models (GAMs) have been popularized by the work of Hastie and Tibshirani (1990) and the availability of user-friendly GAM software in Splus. However, whilst it is flexible and efficient, the GAM framework based on backfitting with linear smoothers presents some difficulties when it comes to model selection and inference. On the other hand, the mathematically elegant work of Wahba (1990) and co-workers on Generalized Spline Smoothing (GSS) provides a rigorous framework for model selection (Gu and Wahba, 1991) and inference with GAMs constructed from smoothing splines, but these models are computationally very expensive, with operation counts of cubic order in the number of data. A “middle way” between these approaches is to construct GAMs using penalized regression splines (see e.g. Wahba 1980, 1990; Eilers and Marx 1998; Wood 2000). In this paper we develop this idea and show how GAMs constructed using penalized regression splines can be used to get most of the practical benefits of GSS models, including well-founded model selection and multi-dimensional smooth terms, with the ease of use and low computational cost of backfit GAMs. Inference with the resulting methods also requires slightly fewer approximations than are employed in the GAM modelling software provided in Splus. This paper presents the basic mathematical and numerical approach to GAMs implemented in the R package mgcv, and includes two environmental examples using the methods as implemented in the package.
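To make the idea of a penalized regression spline with integrated smoothness selection concrete, the following is a minimal sketch for a single smooth term with Gaussian errors. It is not the algorithm implemented in mgcv: the truncated power basis, the ridge-type penalty on the knot coefficients, the grid search over the smoothing parameter, and all function names are assumptions chosen for brevity. The score minimized is the usual generalized cross validation criterion, n · RSS / (n − edf)², where edf is the trace of the hat matrix.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Design matrix: polynomial terms plus one truncated power term per knot.
    (Illustrative basis choice, not the basis used by mgcv.)"""
    poly = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^degree
    trunc = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    return np.hstack([poly, trunc])

def fit_penalized_spline(x, y, knots, lam, degree=3):
    """Penalized least squares fit for one smoothing parameter `lam`."""
    X = truncated_power_basis(x, knots, degree)
    # Penalize only the knot coefficients; the polynomial part is unpenalized.
    D = np.diag(np.r_[np.zeros(degree + 1), np.ones(len(knots))])
    A = X.T @ X + lam * D
    beta = np.linalg.solve(A, X.T @ y)
    fitted = X @ beta
    # Effective degrees of freedom = trace of the hat matrix X A^{-1} X^T.
    edf = np.trace(np.linalg.solve(A, X.T @ X))
    n = len(y)
    gcv = n * np.sum((y - fitted) ** 2) / (n - edf) ** 2
    return {"lam": lam, "beta": beta, "fitted": fitted, "edf": edf, "gcv": gcv}

def select_by_gcv(x, y, knots, lams, degree=3):
    """Return the fit whose smoothing parameter minimizes the GCV score."""
    fits = [fit_penalized_spline(x, y, knots, lam, degree) for lam in lams]
    return min(fits, key=lambda f: f["gcv"])

# Simulated data (hypothetical example, not from the paper).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 200)

# Interior knots at quantiles of x; the basis dimension is fixed in advance
# and the penalty controls the effective smoothness.
knots = np.quantile(x, np.linspace(0, 1, 10)[1:-1])
best = select_by_gcv(x, y, knots, np.logspace(-6, 2, 25))
print(f"lambda = {best['lam']:.3g}, edf = {best['edf']:.2f}")
```

Because the basis has a fixed, modest dimension, each fit costs far less than a full smoothing spline with one basis function per datum, which is the computational point the abstract makes. A GAM with several smooth terms would concatenate one such basis and penalty per term and select several smoothing parameters jointly, which this sketch does not attempt.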
Related resources
Thin plate regression splines
I discuss the production of low rank smoothers for d ≥ 1 dimensional data, which can be fitted by regression or penalized regression methods. The smoothers are constructed by a simple transformation and truncation of the basis that arises from the solution of the thin plate spline smoothing problem, and are optimal in the sense that the truncation is designed to result in the minimum possible pe...
Generalized additive models in business and economics
The paper presents applications of a class of semiparametric models called generalized additive models (GAMs) to several business and economic datasets. Applications include analysis of the wage-education relationship, brand choice, and the number of trips to a doctor’s office. The dependent variable may be continuous, categorical or count. These semiparametric models are flexible and robust extension...
On confidence intervals for GAMs based on penalized regression splines
Generalized additive models represented using penalized regression splines, estimated by penalized likelihood maximisation and with smoothness selected by generalized cross validation or similar criteria, provide a computationally efficient general framework for practical smooth modelling. Various authors have proposed approximate Bayesian interval estimates for such models, based on extensions...
A penalized framework for distributed lag non-linear models.
Distributed lag non-linear models (DLNMs) are a modelling tool for describing potentially non-linear and delayed dependencies. Here, we illustrate an extension of the DLNM framework through the use of penalized splines within generalized additive models (GAM). This extension offers built-in model selection procedures and the possibility of accommodating assumptions on the shape of the lag struc...
Penalized Bregman Divergence Estimation via Coordinate Descent
Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman et al. (2007) for penalized linear regression and penalized logistic regression and was shown to be computationally superior. This paper explores...