Covariance Operator Based Dimensionality Reduction with Extension to Semi-Supervised Settings
Authors
Abstract
We consider the task of dimensionality reduction for regression (DRR) informed by real-valued multivariate labels. The problem is often treated as a regression task whose goal is to find a low-dimensional representation of the input data that preserves the statistical correlation with the targets. Recently, Covariance Operator Inverse Regression (COIR) was proposed as an effective solution that exploits the covariance structures of both inputs and outputs. COIR addresses known limitations of recent DRR techniques and admits a closed-form solution without resorting to the explicit output-space slicing often required by existing IR-based methods. In this work we provide a unifying view of COIR and other DRR techniques and relate them to popular supervised dimensionality reduction methods, including canonical correlation analysis (CCA) and linear discriminant analysis (LDA). We then show that COIR can be effectively extended to a semi-supervised learning setting in which many of the input points lack their corresponding multivariate targets. A study of the benefits of the proposed approaches is presented on several important regression problems in both fully-supervised and semi-supervised settings.
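The abstract relates covariance-based DRR to CCA. In the linear, fully-supervised special case, finding directions of the input that preserve statistical correlation with the targets reduces to a generalized eigenproblem on empirical covariance matrices, whose solutions coincide with the CCA directions. The sketch below illustrates that linear special case only; it is not the kernelized COIR estimator from the paper, and the function name and ridge regularization are our own choices.

```python
import numpy as np

def linear_drr_directions(X, Y, d, reg=1e-6):
    """Find d directions for X that preserve correlation with targets Y.

    Solves the generalized eigenproblem
        S_xy S_yy^{-1} S_yx v = lam S_xx v,
    the linear analogue of covariance-based DRR (equivalently, CCA on X).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n
    # Whiten X via Cholesky so an ordinary symmetric eigenproblem remains.
    A = np.linalg.inv(np.linalg.cholesky(Sxx)).T      # A^T Sxx A = I
    M = A.T @ Sxy @ np.linalg.solve(Syy, Sxy.T) @ A   # symmetric PSD
    lam, W = np.linalg.eigh(M)                        # ascending order
    V = A @ W[:, ::-1][:, :d]                         # top-d directions
    return V, lam[::-1][:d]                           # lam = squared canonical corrs
```

The eigenvalues returned are (regularized) squared canonical correlations, so they lie in [0, 1]; a value near 1 indicates a direction of X almost fully predictable from Y.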
Similar resources
Semi-supervised Laplacian Regularization of Kernel Canonical Correlation Analysis
Kernel canonical correlation analysis (KCCA) is a dimensionality reduction technique for paired data. By finding directions that maximize correlation, KCCA learns representations that are more closely tied to the underlying semantics of the data rather than noise. However, meaningful directions are not only those that have high correlation to another modality, but also those that capture the ma...
Classification by semi-supervised discriminative regularization
Linear discriminant analysis (LDA) is a well-known dimensionality reduction method which can be easily extended for data classification. Traditional LDA aims to preserve the separability of different classes and the compactness of the same class in the output space by maximizing the between-class covariance and simultaneously minimizing the within-class covariance. However, the performance of L...
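The trade-off this snippet describes, maximizing between-class scatter while minimizing within-class scatter, is the classic LDA criterion. A minimal numpy sketch of that criterion follows; it is an illustration of standard LDA, not of the discriminative-regularization method above, and the function name and ridge term are ours.

```python
import numpy as np

def lda_directions(X, y, d, reg=1e-6):
    """Classic LDA: maximize between-class over within-class covariance.

    Solves Sw^{-1} Sb v = lam v; at most (#classes - 1) informative
    directions exist, since Sb has rank #classes - 1.
    """
    classes = np.unique(y)
    mu = X.mean(axis=0)
    p = X.shape[1]
    Sw = reg * np.eye(p)          # within-class scatter (ridge-regularized)
    Sb = np.zeros((p, p))         # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    lam, V = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(lam.real)[::-1]
    return V[:, order[:d]].real   # top-d discriminant directions
```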
Semi-supervised classification based on random subspace dimensionality reduction
Graph structure is vital to graph-based semi-supervised learning. However, the problem of constructing a graph that reflects the underlying data distribution has seldom been investigated in semi-supervised learning, especially for high-dimensional data. In this paper, we focus on graph construction for semi-supervised learning and propose a novel method called Semi-Supervised Classification base...
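To make concrete how a graph carries label information in graph-based semi-supervised learning, here is a minimal sketch of iterative label propagation over an RBF affinity graph. This is a generic textbook baseline, not the random-subspace method of the paper above; the function name, the convention of `-1` for unlabeled points, and the bandwidth `sigma` are our assumptions.

```python
import numpy as np

def label_propagation(X, y, n_iter=100, sigma=1.0):
    """Propagate labels over an RBF affinity graph.

    y holds integer class labels, with -1 marking unlabeled points.
    Labeled points are clamped to their given labels at every step.
    """
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    W = np.exp(-D2 / (2 * sigma ** 2))                   # RBF affinities
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)                 # row-stochastic walk
    labeled = y >= 0
    classes = np.unique(y[labeled])
    F = np.zeros((len(X), len(classes)))                 # soft label matrix
    cols = np.searchsorted(classes, y[labeled])
    F[labeled, cols] = 1.0
    for _ in range(n_iter):
        F = P @ F                    # diffuse labels along graph edges
        F[labeled] = 0.0             # clamp labeled points back
        F[labeled, cols] = 1.0
    return classes[F.argmax(axis=1)]
```

With a well-separated two-cluster dataset and one labeled seed per cluster, the remaining points inherit their cluster's label.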
On Computational Issues of Semi-Supervised Local Fisher Discriminant Analysis
Dimensionality reduction is one of the important preprocessing steps in practical pattern recognition. SEmi-supervised Local Fisher discriminant analysis (SELF)— which is a semi-supervised and local extension of Fisher discriminant analysis—was shown to work excellently in experiments. However, when data dimensionality is very high, a naive use of SELF is prohibitive due to high computational c...
Semi-supervised Sparsity Pairwise Constraint Preserving Projections based on GA
A key limitation of existing semi-supervised dimensionality reduction with pairwise constraints is its inability to preserve the global geometric structure of the data. A dimensionality reduction algorithm called Semi-supervised Sparsity Pairwise Constraint Preserving Projections based on Genetic Algorithm (SSPCPPGA) is proposed. On the one hand, the algorithm fuses unsup...