Robust Clustering in Arbitrarily Oriented Subspaces
Abstract
In this paper, we propose an efficient and effective method for finding arbitrarily oriented subspace clusters by mapping the data space to a parameter space that defines the set of possible arbitrarily oriented subspaces. The objective of a clustering algorithm based on this principle is to find, among all possible subspaces, those that accommodate many database objects. In contrast to existing approaches, our method can find subspace clusters of different dimensionality even if they are sparse or are intersected by other clusters in a noisy environment. A broad experimental evaluation demonstrates the robustness, efficiency, and effectiveness of our method.
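To make the idea of mapping data space to parameter space concrete, here is a minimal sketch of the classic Hough transform for lines in two dimensions. It is an illustration under stated assumptions, not the method proposed in the paper: the function name `hough_line_votes`, the grid sizes `n_theta` and `n_rho`, and the restriction to 2D lines are all hypothetical choices made for this example. Each point votes for every line (theta, rho) passing through it, and grid cells that collect many votes correspond to subspaces that accommodate many objects.

```python
import numpy as np

def hough_line_votes(points, n_theta=180, n_rho=200):
    """Vote in a discretized (theta, rho) grid for 2D lines.

    A line is parameterized as rho = x*cos(theta) + y*sin(theta).
    Returns the vote accumulator, the theta grid, and the rho range.
    Illustrative sketch only; names and grid sizes are assumptions.
    """
    points = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)

    # |rho| can never exceed the largest point norm; guard against zero.
    rho_max = max(np.linalg.norm(points, axis=1).max(), 1e-12)

    # rho value of each point for each theta: shape (n_points, n_theta).
    rhos = (points[:, 0:1] * np.cos(thetas)
            + points[:, 1:2] * np.sin(thetas))

    # Map rho in [-rho_max, rho_max] to the nearest of n_rho grid cells.
    scaled = (rhos + rho_max) / (2.0 * rho_max) * (n_rho - 1)
    rho_idx = np.clip(np.floor(scaled + 0.5).astype(int), 0, n_rho - 1)

    # Each point casts one vote per theta column.
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    accumulator = np.zeros((n_theta, n_rho), dtype=int)
    np.add.at(accumulator, (theta_idx, rho_idx), 1)
    return accumulator, thetas, rho_max


if __name__ == "__main__":
    # 50 points on the horizontal line y = 2.
    pts = [(x, 2.0) for x in np.linspace(-5.0, 5.0, 50)]
    acc, thetas, rho_max = hough_line_votes(pts)
    t, r = np.unravel_index(acc.argmax(), acc.shape)
    rho = r / (acc.shape[1] - 1) * 2.0 * rho_max - rho_max
    print("strongest line: theta =", thetas[t], "rho ~", rho)
```

In this toy setting, the cell with the most votes identifies the line that passes through the most points. Handling higher-dimensional subspaces, noise, and intersecting clusters requires considerably more machinery than this grid voting scheme provides.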
Similar Resources
Hierarchical Subspace Clustering
It is well known that traditional clustering methods that consider all dimensions of the feature space usually fail in terms of efficiency and effectiveness when applied to high-dimensional data. This poor behavior stems from the fact that clusters may not be found in the full high-dimensional feature space even though clusters exist in subspaces of it. To overcome these limitations of tra...
Global Correlation Clustering Based on the Hough Transform
In this article, we propose an efficient and effective method for finding arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set of possible arbitrarily oriented subspaces. The objective of a clustering algorithm based on this principle is to find those among all the possible subspaces that accommodate many database objects. In contrast to existin...
Hard-LOST: Modified k-Means for Oriented Lines
Robust clustering of data into linear subspaces is a common problem. Here we treat clustering into one-dimensional subspaces that cross the origin. This problem arises in blind source separation, where the subspaces correspond directly to columns of a mixing matrix. We present an algorithm that identifies these subspaces using a modified k-means procedure, where line orientations and distances ...
SONAR: Signal De-mixing for Robust Correlation Clustering
Clustering is one of the most fundamental challenges in data mining. We identified three core problems which turn finding a natural grouping of a data set into a difficult task: First, clusters may exist in arbitrarily oriented subspaces of various dimensionality (also known as correlation clusters). Secondly, the cluster structure may be hidden by noise and outliers. Finally, the number, size ...
Learning Transformations for Clustering and Classification
A low-rank transformation learning framework for subspace clustering and classification is proposed here. Many high-dimensional data, such as face images and motion sequences, approximately lie in a union of low-dimensional subspaces. The corresponding subspace clustering problem has been extensively studied in the literature to partition such high-dimensional data into clusters corresponding to...