Semi-Supervised Dimensionality Reduction Using Pairwise Equivalence Constraints
Abstract
When labeled data are scarce, side information, given in the form of pairwise equivalence constraints between points, is commonly used to discover groups within the data. However, existing methods that use side information typically fail in high-dimensional spaces. In this paper, we address the problem of learning from side information for high-dimensional data. To this end, we propose a semi-supervised dimensionality reduction scheme that incorporates pairwise equivalence constraints to find a better embedding space, which improves the performance of subsequent clustering and classification. Our method builds on the assumption that points in a sufficiently small neighborhood tend to have the same label. Equivalence constraints are used to modify the neighborhoods and to increase the separability of different classes. Experimental results on high-dimensional image data sets show that integrating side information into dimensionality reduction improves clustering and classification performance.
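The idea of using constraints to modify neighborhoods can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify. It assumes a symmetric k-nearest-neighbor graph, must-link pairs that force an edge, cannot-link pairs that cut an edge and enter a penalty graph, and a linear projection taken from the smallest eigenvectors of a Laplacian-based matrix. The function name `constrained_embedding` and the parameters `k`, `dim`, and `beta` are hypothetical.

```python
import numpy as np

def constrained_embedding(X, must_link, cannot_link, k=5, dim=2, beta=1.0):
    """Illustrative sketch only (assumed formulation, not the paper's method).

    X: (n, d) data matrix. must_link / cannot_link: lists of index pairs.
    Returns a (d, dim) projection matrix with orthonormal columns.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances; exclude self-matches.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)
    # Symmetric k-nearest-neighbor adjacency matrix.
    W = np.zeros((n, n))
    nn = np.argsort(d2, axis=1)[:, :k]
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    W = np.maximum(W, W.T)
    # Constraints modify the neighborhoods: must-link pairs become
    # neighbors, cannot-link pairs are disconnected and penalized.
    C = np.zeros((n, n))
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
        C[i, j] = C[j, i] = 1.0
    # Graph Laplacians of the neighborhood and penalty graphs.
    L_w = np.diag(W.sum(axis=1)) - W
    L_c = np.diag(C.sum(axis=1)) - C
    Xc = X - X.mean(axis=0)
    # Small eigenvalues keep neighbors close and push cannot-link pairs apart.
    M = Xc.T @ (L_w - beta * L_c) @ Xc
    M = (M + M.T) / 2.0  # enforce exact symmetry before eigh
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return vecs[:, :dim]

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 10))
P_demo = constrained_embedding(X_demo, [(0, 1)], [(0, 2)], k=4, dim=2)
Z_demo = (X_demo - X_demo.mean(axis=0)) @ P_demo  # embedded points, shape (30, 2)
```

The embedded points `Z_demo` could then be passed to any clustering or classification routine. The weight `beta` trades off neighborhood preservation against cannot-link separation and would need tuning in practice.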
Similar resources
Semi-supervised Sparsity Pairwise Constraint Preserving Projections based on GA
The main problem of existing semi-supervised dimensionality reduction methods with pairwise constraints is their limited ability to preserve the global geometric structure of the data. A dimensionality reduction algorithm called Semi-supervised Sparsity Pairwise Constraint Preserving Projections based on Genetic Algorithm (SSPCPPGA) is proposed. On the one hand, the algorithm fuses unsup...
Semi-supervised Gaussian process latent variable model with pairwise constraints
In machine learning, the Gaussian process latent variable model (GP-LVM) has been extensively applied to unsupervised dimensionality reduction. When some supervised information, e.g., pairwise constraints or labels of the data, is available, the traditional GP-LVM cannot directly use it to improve the performance of dimensionality reduction. In this case, i...
Semi-Supervised Dimensionality Reduction
Dimensionality reduction is among the keys in mining high-dimensional data. This paper studies semi-supervised dimensionality reduction. In this setting, besides abundant unlabeled examples, domain knowledge in the form of pairwise constraints is available, which specifies whether a pair of instances belongs to the same class (must-link constraints) or to different classes (cannot-link constraints)...
Constraint-based sparsity preserving projections and its application on face recognition
To address the lack of supervised information in the sparse reconstruction process of Sparsity Preserving Projections (SPP), a semi-supervised dimensionality reduction method named Constraint-based Sparsity Preserving Projections (CSPP) is proposed. CSPP attempts to use the supervision information of must-link and cannot-link constraints to adjust the sparse reconstructiv...
Rank canonical correlation analysis and its application in visual search reranking
Ranking relevance degree information is widely utilized in the ranking models of information retrieval applications, such as text and multimedia retrieval, question answering, and visual search reranking. However, existing feature dimensionality reduction methods neglect this kind of valuable potential supervised information. In this paper, we extend the pairwise constraints from the traditiona...