Reduced Set KPCA for Improving the Training and Execution Speed of Kernel Machines
نویسندگان
چکیده
This paper presents a practical, and theoretically wellfounded, approach to improve the speed of kernel manifold learning algorithms relying on spectral decomposition. Utilizing recent insights in kernel smoothing and learning with integral operators, we propose Reduced Set KPCA (RSKPCA), which also suggests an easy-toimplement method to remove or replace samples with minimal effect on the empirical operator. A simple data point selection procedure is given to generate a substitute density for the data, with accuracy that is governed by a user-tunable parameter `. The effect of the approximation on the quality of the KPCA solution, in terms of spectral and operator errors, can be shown directly in terms of the density estimate error and as a function of the parameter `. We show in experiments that RSKPCA can improve both training and evaluation time of KPCA by up to an order of magnitude, and compares favorably to the widely-used Nystrom and density-weighted Nystrom methods.
منابع مشابه
Reduced-Set Kernel Principal Components Analysis for Improving the Training and Execution Speed of Kernel Machines
This paper 1 presents a practical, and theoretically well-founded, approach to improve the speed of kernel manifold learning algorithms relying on spectral decomposition. Utilizing recent insights in kernel smoothing and learning with integral operators, we propose Reduced Set KPCA (RSKPCA), which also suggests an easy-to-implement method to remove or replace samples with minimal effect on the ...
متن کاملKronecker Factorization for Speeding up Kernel Machines
In kernel machines, such as kernel principal component analysis (KPCA), Gaussian Processes (GPs), and Support Vector Machines (SVMs), the computational complexity of finding a solution is O(n), where n is the number of training instances. To reduce this expensive computational complexity, we propose using Kronecker factorization, which approximates a positive definite kernel matrix by the Krone...
متن کاملUnsupervised Nonlinear Feature Extraction Method and Its Effects on Target Detection in High-dimensional Data
The principal component analysis (PCA) is one of the most effective unsupervised techniques for feature extraction. To extract higher order properties of data, researchers extended PCA to kernel PCA (KPCA) by means of kernel machines. In this paper, KPCA is applied as a feature extraction procedure to dimension reduction for target detection as a preprocessing on hyperspectral images. Then the ...
متن کاملA comparative study of performance of K-nearest neighbors and support vector machines for classification of groundwater
The aim of this work is to examine the feasibilities of the support vector machines (SVMs) and K-nearest neighbor (K-NN) classifier methods for the classification of an aquifer in the Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables including EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, ...
متن کاملSeparating Well Log Data to Train Support Vector Machines for Lithology Prediction in a Heterogeneous Carbonate Reservoir
The prediction of lithology is necessary in all areas of petroleum engineering. This means that to design a project in any branch of petroleum engineering, the lithology must be well known. Support vector machines (SVM’s) use an analytical approach to classification based on statistical learning theory, the principles of structural risk minimization, and empirical risk minimization. In this res...
متن کامل