Efficient Approximation of Cross-Validation for Kernel Methods using Bouligand Influence Function
Abstract
Model selection is a key issue in both current research on and applications of kernel methods. Cross-validation is a commonly employed and widely accepted model selection criterion, but it requires training the algorithm under consideration multiple times, which is computationally intensive. In this paper, we present a novel strategy for approximating cross-validation based on the Bouligand influence function (BIF), which requires solving the algorithm only once. The BIF measures the impact of an infinitesimally small amount of contamination of the original distribution. We first establish the link between the concept of the BIF and the concept of cross-validation: the BIF corresponds to the first-order term of a Taylor expansion. We then calculate the BIF and higher-order BIFs, and apply these theoretical results to approximate the cross-validation error in practice. Experimental results demonstrate that our approximate cross-validation criterion is sound and efficient.
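To make the Taylor-expansion link concrete, the learning algorithm can be viewed as a map T from a distribution to a decision function. The display below sketches the standard definition of the Bouligand influence function (Christmann and Van Messem) and the expansion the abstract alludes to; the notation is an assumption on our part and may differ from the paper's.

```latex
% BIF of T at P in the direction of a contaminating distribution Q:
\mathrm{BIF}(Q; T, P) = \lim_{\epsilon \downarrow 0}
  \frac{T\bigl((1-\epsilon)P + \epsilon Q\bigr) - T(P)}{\epsilon}.

% With P_{\epsilon,Q} = (1-\epsilon)P + \epsilon Q, higher-order BIFs act as
% the coefficients of a Taylor expansion around the single solution T(P):
T(P_{\epsilon,Q}) \approx T(P)
  + \sum_{k=1}^{K} \frac{\epsilon^{k}}{k!}\,\mathrm{BIF}_{k}(Q; T, P).
```

Choosing Q so that P_{ε,Q} equals the empirical distribution with one fold removed turns this expansion into an approximation of the cross-validation error that reuses the solution T(P) computed once on the full sample. As a small numerical illustration, for kernel ridge regression (an LS-SVM-style model with squared loss) the exact leave-one-out residual of the linear smoother is r_i/(1 - h_ii), and truncating the geometric series 1/(1 - h) = 1 + h + h^2 + ... mimics keeping first- and higher-order influence terms. This is a sketch of the expansion idea under those assumptions, not the paper's algorithm:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
n = 80
X = rng.uniform(-3.0, 3.0, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

lam = 1e-2                                        # illustrative regularization value
K = rbf_kernel(X, X, gamma=0.5)
H = K @ np.linalg.inv(K + n * lam * np.eye(n))    # hat matrix: y_hat = H y
resid = y - H @ y
h = np.diag(H)                                    # leverages, all strictly below 1

# Exact leave-one-out residuals via the closed form r_i / (1 - h_ii),
# and Taylor (geometric-series) truncations of 1 / (1 - h): keeping k
# terms plays the role of using influence terms up to order k.
loo_exact = resid / (1 - h)
loo_order1 = resid * (1 + h)
loo_order2 = resid * (1 + h + h ** 2)

print("exact LOO MSE    :", np.mean(loo_exact ** 2))
print("1st-order approx :", np.mean(loo_order1 ** 2))
print("2nd-order approx :", np.mean(loo_order2 ** 2))
```

The higher the truncation order, the closer the approximation, while every quantity above is computed from the single fit on the full data set.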
Similar Papers
Efficient Optimization of the Parameters of LS-SVM for Regression versus Cross-Validation Error
Least Squares Support Vector Machines (LS-SVM) are the state of the art in kernel methods for regression and function approximation. In the last few years, these models have been successfully applied to time series modelling and prediction. A key issue for the good performance of an LS-SVM model is the choice of values for both the kernel parameters and its hyperparameters in order to avoid overfi...
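For context, fitting the LS-SVM regression model this abstract refers to amounts to solving one linear system in the dual variables; a minimal sketch, assuming a precomputed kernel matrix K and the regularization hyperparameter gamma (exactly the quantities that must be tuned against the cross-validation error):

```python
import numpy as np

def lssvm_fit(K, y, gamma=10.0):
    # LS-SVM regression dual system (Suykens et al.):
    # [ 0   1^T         ] [b]       [0]
    # [ 1   K + I/gamma ] [alpha] = [y]
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, dual coefficients alpha
```

A prediction at a new point is then b plus the kernel-weighted sum of the alphas, and the whole system is re-solved for every candidate (gamma, kernel-parameter) pair during model selection.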
Model Selection for Kernel Probit Regression
The convex optimisation problem involved in fitting a kernel probit regression (KPR) model can be solved efficiently via an iteratively re-weighted least-squares (IRWLS) approach. The use of successive quadratic approximations of the true objective function suggests an efficient approximate form of leave-one-out cross-validation for KPR, based on an existing exact algorithm for the weighted lea...
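To illustrate the IRWLS idea in its simplest setting, here is a sketch for a linear probit model; the kernel version replaces the design matrix with a kernel expansion. Each iteration is a weighted least-squares solve, which is why an exact leave-one-out algorithm for weighted least squares yields an efficient approximate leave-one-out criterion. The ridge term and iteration count are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def probit_irwls(X, y, n_iter=25, ridge=1e-6):
    # Iteratively re-weighted least squares for probit regression.
    # X: (n, d) design matrix, y: binary labels in {0, 1}.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        eta = X @ w
        mu = np.clip(norm.cdf(eta), 1e-10, 1 - 1e-10)
        phi = np.maximum(norm.pdf(eta), 1e-10)   # guard against underflow
        W = phi ** 2 / (mu * (1 - mu))           # GLM working weights
        z = eta + (y - mu) / phi                 # working response
        A = X.T @ (W[:, None] * X) + ridge * np.eye(d)
        w = np.linalg.solve(A, X.T @ (W * z))    # one weighted LS solve
    return w
```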
Bouligand Derivatives and Robustness of Support Vector Machines for Regression
We investigate robustness properties for a broad class of support vector machines with non-smooth loss functions. These kernel methods are inspired by convex risk minimization in infinite dimensional Hilbert spaces. Leading examples are the support vector machine based on the ε-insensitive loss function, and kernel based quantile regression based on the pinball loss function. Firstly, we propos...
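The two non-smooth loss functions named in the abstract are easy to state; a minimal sketch of both, with illustrative parameter values, applied to a residual r = y - f(x):

```python
import numpy as np

def eps_insensitive(r, eps=0.1):
    # SVM regression loss: zero inside the eps-tube, linear outside it
    return np.maximum(np.abs(r) - eps, 0.0)

def pinball(r, tau=0.5):
    # quantile (pinball) loss at level tau; tau = 0.5 gives half the
    # absolute loss, other tau values target conditional quantiles
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)
```

Both losses have kinks, so the classical (Gateaux-derivative) influence function need not exist, which is why Bouligand derivatives are the natural tool for this robustness analysis.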
Approximate Regularization Paths for ℓ2-loss Support Vector Machines
We consider approximate regularization paths for kernel methods and in particular ℓ2-loss Support Vector Machines (SVMs). We provide a simple and efficient framework for maintaining an ε-approximate solution (and a corresponding ε-coreset) along the entire regularization path. We prove correctness and also practical efficiency of our method. Unlike previous algorithms, our algorithm does not need an...
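To convey why tracking a solution along the path is cheaper than solving every regularization value from scratch, here is a hedged warm-start sketch for the ℓ2-loss (squared-hinge) SVM; this is not the coreset-based method of the abstract, and the solver, step size, and grid are illustrative assumptions:

```python
import numpy as np

def squared_hinge(w, X, y, C):
    # L2-regularized l2-loss SVM: 0.5||w||^2 + C * sum(max(0, 1 - y x.w)^2)
    margins = 1.0 - y * (X @ w)
    active = margins > 0
    obj = 0.5 * w @ w + C * np.sum(margins[active] ** 2)
    grad = w - 2.0 * C * X[active].T @ (y[active] * margins[active])
    return obj, grad

def approx_path(X, y, C_grid, tol=1e-6, lr=1e-3, max_iter=5000):
    # Warm-started gradient descent along a grid of C values: each
    # solution seeds the next one, so later fits need few iterations.
    w = np.zeros(X.shape[1])
    path = []
    for C in C_grid:
        for _ in range(max_iter):
            _, g = squared_hinge(w, X, y, C)
            if np.linalg.norm(g) < tol:
                break
            w -= lr * g
        path.append(w.copy())
    return path
```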
Prediction of Surface Water Supply Sources for the District of Columbia Using Least Squares Support Vector Machines (LS-SVM) Method
In this research, we developed a predictive model based on least squares support vector machines (LS-SVM) that forecasts future streamflow discharge from past streamflow discharge data. A Gaussian Radial Basis Function (RBF) kernel framework was built on the data set to tune the kernel parameters and regularization constants of the model with respect to the given performance measure. Th...