Farthest Centroids Divisive Clustering
Authors: Haw-ren
Abstract
A method is presented to partition a given set of data entries embedded in Euclidean space by recursively bisecting clusters into smaller ones. The initial set is subdivided into two subsets whose centroids are farthest from each other, and the process is repeated recursively on each subset. The bisection task can be formulated as an integer programming problem, which is NP-hard. Instead, an approximate algorithm based on a spectral approach is given. Experimental evidence shows that the clustering method often outperforms a standard spectral clustering method, but at a higher computational cost. The paper also discusses how to improve the standard K-means algorithm, a successful clustering method that is sensitive to initialization. It is shown that the quality of clustering resulting from the K-means technique can be enhanced by using the proposed algorithm for its initialization.
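The recursive bisection described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a common spectral relaxation in which each cluster is split by the sign of its points' projections onto the leading principal direction of the centered data, and it assumes the largest remaining cluster is split next. The function names `spectral_bisect`, `divisive_clustering`, and `centroids` are hypothetical and chosen only for illustration.

```python
import numpy as np

def spectral_bisect(X):
    """Return a boolean mask splitting the rows of X into two groups.

    Assumption: the split is the sign of each point's projection onto the
    leading right singular vector of the centered data matrix.
    """
    centered = X - X.mean(axis=0)
    # Rows are data points, so the right singular vectors live in feature space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    return scores > 0

def divisive_clustering(X, k):
    """Recursively bisect until k clusters exist; returns integer labels.

    Assumption: the largest remaining cluster is the one split at each step.
    """
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, k):
        sizes = np.bincount(labels, minlength=new_label)
        target = int(np.argmax(sizes))
        idx = np.flatnonzero(labels == target)
        mask = spectral_bisect(X[idx])
        if mask.all() or not mask.any():
            break  # degenerate split (e.g. identical points); stop early
        labels[idx[mask]] = new_label
    return labels

def centroids(X, labels):
    """Mean of each cluster, ordered by label value."""
    return np.array([X[labels == j].mean(axis=0) for j in np.unique(labels)])

if __name__ == "__main__":
    # Two well-separated groups of three points each.
    X = np.array([[0, 0], [0, 1], [1, 0],
                  [10, 10], [10, 11], [11, 10]], dtype=float)
    labels = divisive_clustering(X, 2)
    print(labels)
    print(centroids(X, labels))
```

The centroids returned by such a procedure could serve as starting centers for K-means, for example by passing them as the `init` array of scikit-learn's `KMeans` with `n_init=1`, which is one way to try the initialization idea the abstract mentions.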
Related resources
Hybrid Hierarchical Clustering: an Experimental Analysis
In this paper, we present a hybrid clustering method that combines divisive hierarchical clustering with agglomerative hierarchical clustering. We use the bisect K-means divisive clustering algorithm in our method. First, we cluster the document collection using the bisect K-means clustering algorithm with K’ > K as the total number of clusters. Second, we calculate the centroids of K’ clu...
Divisive Hierarchical Clustering with K-means and Agglomerative Hierarchical Clustering
This work implements a divisive hierarchical clustering algorithm with K-means and applies agglomerative hierarchical clustering to the resulting data, aiming for efficient and accurate results in data mining. The initial k centroids are found in a fixed manner instead of being chosen randomly: the k centroids are chosen by dividing the one-dimensional data of a particular clu...
Optimizing K-Means by Fixing Initial Cluster Centers
Data mining techniques help in business decision making and in predicting behaviors and future trends. Clustering is a data mining technique used to group objects that share similar characteristics. Clustering analyzes data objects without consulting a known class label or category, i.e., it is an unsupervised data mining technique. K-means is a widely used partitional clustering al...
Algorithms for Clustering Molecular Dynamics Configurations
Two traditional clustering algorithms are applied to configurations from a long molecular dynamics trajectory and compared using two sets of test data. First, a subset of atoms was chosen to present conformations which naturally fall into a number of clusters. Second, a subset of atoms was selected to span a relatively continuous region of conformational space rather than form discrete conforma...