K-Subspace Clustering
Abstract
The widely used K-means clustering algorithm deals with ball-shaped (spherical Gaussian) clusters. In this paper, we extend K-means clustering to accommodate extended clusters in subspaces, such as line-shaped clusters, plane-shaped clusters, and ball-shaped clusters. The algorithm retains much of the flavor of K-means clustering: it is easy to implement and fast to converge. A model selection procedure is incorporated to determine the cluster shape. As a result, our algorithm can recognize a wide range of the subspace clusters studied in the literature, as well as global ball-shaped clusters (living in all dimensions). We carry out extensive experiments on both synthetic and real-world datasets, and the results demonstrate the effectiveness of our algorithm.
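To make the idea of a line-shaped cluster concrete, the following is a minimal sketch of a K-means-style loop in which each cluster is represented by a line instead of a single center. It is not the paper's implementation: it assumes 2-D data and line-shaped clusters only, it omits the model selection step that chooses between line, plane, and ball shapes, and the names `fit_line`, `line_distance`, and `k_lines` are hypothetical. The toy data and initial labels are also made up for illustration.

```python
import math

def fit_line(points):
    """Least-squares (total) line through 2-D points: returns (centroid, unit direction)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points)
    syy = sum((p[1] - cy) ** 2 for p in points)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points)
    # Angle of the principal axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), (math.cos(theta), math.sin(theta))

def line_distance(p, line):
    """Perpendicular distance from point p to a line given as (centroid, unit direction)."""
    (cx, cy), (dx, dy) = line
    return abs((p[0] - cx) * dy - (p[1] - cy) * dx)

def k_lines(points, init_labels, k, max_iter=100):
    """Alternate between refitting each cluster's line and reassigning points (assumed sketch)."""
    labels = list(init_labels)
    lines = [None] * k
    for _ in range(max_iter):
        # Update step: refit each line; keep the previous line if a cluster has < 2 points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if len(members) >= 2:
                lines[j] = fit_line(members)
        if any(line is None for line in lines):
            raise ValueError("each cluster needs at least 2 points in the initial labels")
        # Assignment step: each point goes to the line it is closest to.
        new_labels = [min(range(k), key=lambda j: line_distance(p, lines[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, lines

# Toy data (made up): ten points on y = 0 followed by ten points on y = x + 3.
points = [(float(x), 0.0) for x in range(10)] + [(float(x), x + 3.0) for x in range(10)]
# Initial labels are correct except for one swapped point in each cluster.
init_labels = [0] * 10 + [1] * 10
init_labels[5], init_labels[15] = 1, 0
labels, lines = k_lines(points, init_labels, k=2)
```

As with K-means, the result depends on the initial labels: a poor starting partition can leave the loop in a local optimum, which is why the example starts from a nearly correct assignment.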
Related articles
A Mutual Subspace Clustering Algorithm for High Dimensional Datasets
Generation of consistent clusters is always an interesting research issue in the field of knowledge and data engineering. In real applications, different similarity measures and different clustering techniques may be adopted in different clustering spaces. In such a case, it is very difficult or even impossible to define an appropriate similarity measure and clustering criteria in the union spa...
A Robust k-Means Type Algorithm for Soft Subspace Clustering and Its Application to Text Clustering
Soft subspace clustering algorithms are effective clustering techniques for high dimensional datasets. Although several soft subspace clustering algorithms have been developed in recent years, their robustness should be further improved. In this work, a novel soft subspace clustering algorithm, RSSKM, is proposed. It is based on the incorporation of the alternative distance metric into the framework of kme...
Discriminative K-means for Clustering
We present a theoretical study on the discriminative clustering framework, recently proposed for simultaneous subspace selection via linear discriminant analysis (LDA) and clustering. Empirical results have shown its favorable performance in comparison with several other popular clustering algorithms. However, the inherent relationship between subspace selection and clustering in this framework...
K-Subspace clustering and its application in sparse component analysis
The K-subspace clustering algorithm is established for sparse component analysis (SCA) and overcomes a difficulty that conventional SCA algorithms cannot. Conventional SCA algorithms can perform only single-dominant SCA, not multiple-dominant SCA, whereas the proposed SCA algorithm based on K-subspace clustering overcomes this limitation.
A Text Clustering System based on k-means Type Subspace Clustering and Ontology
This paper presents a text clustering system developed based on a k-means type subspace clustering algorithm to cluster large, high dimensional and sparse text data. In this algorithm, a new step is added in the k-means clustering process to automatically calculate the weights of keywords in each cluster so that the important words of a cluster can be identified by the weight values. For unders...
Clustering in applications with multiple data sources - A mutual subspace clustering approach
In many applications, such as bioinformatics and cross-market customer relationship management, there are data from multiple sources jointly describing the same set of objects. An important data mining task is to find interesting groups of objects that form clusters in subspaces of the data sources jointly supported by those data sources. In this paper, we study a novel problem of mining mutual...