Maximum Margin Principal Components

Authors

  • Xianghui Luo
  • Robert J. Durrant
Abstract

Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is the fact that the projection of a dataset onto its first K principal components minimizes the sum of squared errors between the original data and the projected data over all possible rank K projections. Thus, PCA provides optimal low-rank representations of data for least-squares linear regression under standard modeling assumptions. On the other hand, when the loss function for a prediction problem is not the least-squares error, PCA is typically a heuristic choice of dimensionality reduction – in particular for classification problems under the zero-one loss. In this paper we target classification problems by proposing a straightforward alternative to PCA that aims to minimize the difference in margin distribution between the original and the projected data. Extensive experiments show that our simple approach typically outperforms PCA on any particular dataset, in terms of classification error, though this difference is not always statistically significant, and despite being a filter method is frequently competitive with Partial Least Squares (PLS) and Lasso on a wide range of datasets.
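
To make the optimality claim concrete: by the Eckart–Young theorem, projecting onto the first K principal components minimizes the squared reconstruction error over all rank-K projections. Here is a minimal numpy sketch (illustrative names only, not the authors' code) checking this against an arbitrary random rank-K projection:

```python
import numpy as np

def pca_project(X, k):
    """Project centered data onto its first k principal components
    and reconstruct; returns the rank-k reconstruction."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vk = Vt[:k].T                 # top-k principal directions
    return Xc @ Vk @ Vk.T         # rank-k least-squares reconstruction

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
k = 3
X_pca = pca_project(X, k)
Xc = X - X.mean(axis=0)

# Compare against a random rank-k orthogonal projection:
# PCA should never do worse (Eckart-Young).
Q, _ = np.linalg.qr(rng.normal(size=(10, k)))
X_rand = Xc @ Q @ Q.T
print(np.sum((Xc - X_pca) ** 2) <= np.sum((Xc - X_rand) ** 2))  # True
```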


Related articles

Infinite Markov-Switching Maximum Entropy Discrimination Machines

In this paper, we present a method that combines the merits of Bayesian nonparametrics, specifically stick-breaking priors, and large-margin kernel machines in the context of sequential data classification. The proposed model employs a set of (theoretically) infinite interdependent large-margin classifiers as model components that robustly capture the local nonlinearity of complex data. The employe...
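
The stick-breaking priors mentioned here admit a compact sketch. This is a generic truncated stick-breaking sampler for a Dirichlet process prior; the concentration parameter and truncation level below are illustrative choices, not taken from the paper:

```python
import numpy as np

def stick_breaking(alpha, n_components, rng):
    """Truncated stick-breaking weights for a Dirichlet process prior:
    beta_i ~ Beta(1, alpha); pi_i = beta_i * prod_{j<i} (1 - beta_j)."""
    betas = rng.beta(1.0, alpha, size=n_components)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    return betas * remaining

rng = np.random.default_rng(0)
weights = stick_breaking(alpha=2.0, n_components=20, rng=rng)
print(weights.sum())  # close to 1; the shortfall is truncation error
```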


Bayesian Maximum Margin Principal Component Analysis

Supervised dimensionality reduction has shown great advantages in finding predictive subspaces. Previous methods rarely consider the popular maximum margin principle and are prone to overfitting on the usually small training data, especially those under the maximum likelihood framework. In this paper, we present a posterior-regularized Bayesian approach to combine Principal Component Analysis (...


Generalized Low Rank Models (Ph.D. dissertation, Institute for Computational and Mathematical Engineering, Stanford University)

Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. This dissertation extends the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well known techniques in data analysis, such as nonnegative matrix factorization, matrix compl...
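
As a rough illustration of how such generalized low rank models are typically fit, here is a hedged alternating-least-squares sketch for the quadratic-loss special case only, which recovers a PCA-style rank-k approximation; the dissertation's framework substitutes other losses and regularizers per column type, and all names below are illustrative:

```python
import numpy as np

def glrm_quadratic(A, k, n_iters=50):
    """Alternating least squares for A ~= X @ Y with rank k.
    Under quadratic loss this recovers an (unregularized) PCA-style
    low-rank approximation; GLRMs swap the loss per data type."""
    m, n = A.shape
    rng = np.random.default_rng(0)
    X = rng.normal(size=(m, k))
    Y = rng.normal(size=(k, n))
    for _ in range(n_iters):
        # Each subproblem is an ordinary least-squares solve.
        Y = np.linalg.lstsq(X, A, rcond=None)[0]
        X = np.linalg.lstsq(Y.T, A.T, rcond=None)[0].T
    return X, Y

A = np.random.default_rng(1).normal(size=(100, 20))
X, Y = glrm_quadratic(A, k=5)
print(np.linalg.norm(A - X @ Y))  # residual of the rank-5 fit
```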


Compression of Breast Cancer Images By Principal Component Analysis

The principle of dimensionality reduction with PCA is the representation of the dataset X in terms of eigenvectors e_i ∈ R^N of its covariance matrix. The eigenvectors oriented in the directions of maximum variance of X in R^N carry the most relevant information about X. These eigenvectors are called principal components [8]. Ass...
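
The construction described in this excerpt maps directly onto code: form the covariance matrix, take its top-k eigenvectors e_i, and represent the data in that basis. A minimal numpy sketch with illustrative names, not the paper's implementation:

```python
import numpy as np

def pca_compress(X, k):
    """Represent rows of X in terms of the top-k eigenvectors e_i of
    the covariance matrix; returns the compressed coefficients and
    the reconstruction from them."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    E = vecs[:, ::-1][:, :k]             # top-k principal components
    coeffs = Xc @ E                      # compressed representation
    return coeffs, coeffs @ E.T + mean   # reconstruction

# e.g. 50 flattened 8x8 image patches (synthetic stand-in data)
X = np.random.default_rng(2).normal(size=(50, 64))
coeffs, X_hat = pca_compress(X, k=8)
print(coeffs.shape, np.linalg.norm(X - X_hat))
```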


Weighted Maximum Variance Dimensionality Reduction

Dimensionality reduction procedures such as principal component analysis and the maximum margin criterion discriminant are special cases of a weighted maximum variance (WMV) approach. We present a simple two-parameter version of WMV that we call 2P-WMV. We study the classification error given by the 1-nearest-neighbor algorithm on features extracted by our and other dimensionality reduction met...
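
A heavily hedged sketch of one standard reading of a weighted-maximum-variance objective, since the blurb is truncated: choose directions G maximizing the weighted sum of pairwise squared distances between projected points, sum_ij W_ij * ||G^T x_i - G^T x_j||^2, which reduces to an eigenproblem on a graph-Laplacian form; with uniform weights this recovers the PCA directions. The particular two-parameter weighting of 2P-WMV is not reproduced here:

```python
import numpy as np

def wmv(X, W, k):
    """Weighted maximum variance: directions G maximizing
    sum_ij W[i,j] * ||G^T x_i - G^T x_j||^2, i.e. the top-k
    eigenvectors of X^T L X with Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    M = X.T @ L @ X
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, ::-1][:, :k]   # largest-eigenvalue directions first

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))
n = X.shape[0]
W_uniform = np.ones((n, n)) / n**2   # uniform weights: PCA special case
G = wmv(X, W_uniform, k=2)
Z = X @ G                            # extracted features, e.g. for 1-NN
print(Z.shape)
```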


Journal:
  • CoRR

Volume: abs/1705.06371  Issue:

Pages: -

Publication date: 2017