Farthest Centroids Divisive Clustering

Authors

Haw-ren Fang

Yousef Saad

Abstract

A method is presented to partition a given set of data entries embedded in Euclidean space by recursively bisecting clusters into smaller ones. The initial set is subdivided into two subsets whose centroids are farthest from each other, and the process is repeated recursively on each subset. The bisection task can be formulated as an integer programming problem, which is NP-hard. Instead, an approximate algorithm based on a spectral approach is given. Experimental evidence shows that the clustering method often outperforms a standard spectral clustering method, but at a higher computational cost. The paper also discusses how to improve the standard K-means algorithm, a successful clustering method that is sensitive to initialization. It is shown that the quality of clustering resulting from the K-means technique can be enhanced by using the proposed algorithm for its initialization.
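
The recursive bisection described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes one common spectral relaxation (splitting a cluster by the sign of each point's projection onto the leading singular direction of the centered data) and assumes the largest cluster is the one split at each step. The function names `spectral_bisect` and `divisive_clustering` are hypothetical.

```python
import numpy as np


def spectral_bisect(X):
    """Return a boolean mask that splits the rows of X into two groups.

    Assumption: the split follows the sign of the projection onto the
    leading right singular vector of the centered data.
    """
    centered = X - X.mean(axis=0)
    # Rows of vt are right singular vectors, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    return scores >= 0


def divisive_clustering(X, k):
    """Bisect repeatedly until k clusters exist or a split degenerates."""
    labels = np.zeros(len(X), dtype=int)
    n_clusters = 1
    while n_clusters < k:
        sizes = np.bincount(labels, minlength=n_clusters)
        target = int(np.argmax(sizes))  # assumption: split the largest cluster
        idx = np.flatnonzero(labels == target)
        mask = spectral_bisect(X[idx])
        if mask.all() or not mask.any():
            break  # degenerate split (e.g. identical points); stop early
        labels[idx[mask]] = n_clusters
        n_clusters += 1
    centroids = np.array(
        [X[labels == c].mean(axis=0) for c in range(n_clusters)]
    )
    return labels, centroids


if __name__ == "__main__":
    X = np.array(
        [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float
    )
    labels, centroids = divisive_clustering(X, k=2)
    print(labels)
    print(centroids)
```

The returned `centroids` array could then serve as the starting centers for a K-means run, in the spirit of the initialization idea in the abstract; for example, scikit-learn's `KMeans` accepts an array of initial centers through its `init` parameter.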

Similar resources

Hybrid Hierarchical Clustering: an Experimental Analysis

In this paper, we present a hybrid clustering method that combines divisive hierarchical clustering with agglomerative hierarchical clustering. We use the bisecting K-means divisive clustering algorithm in our method. First, we cluster the document collection using the bisecting K-means algorithm with K’ > K as the total number of clusters. Second, we calculate the centroids of K’ clu...

Divisive Hierarchical Clustering with K-means and Agglomerative Hierarchical Clustering

The aim is to implement a divisive hierarchical clustering algorithm with K-means and to apply agglomerative hierarchical clustering to the resulting data, for efficient and accurate results in data mining. The initial k centroids are found in a fixed manner instead of being chosen randomly: the k centroids are chosen by dividing the one-dimensional data of a particular clu...

Optimizing K-Means by Fixing Initial Cluster Centers

Data mining techniques help in business decision making and in predicting behaviors and future trends. Clustering is a data mining technique used to group objects that share similar characteristics. Clustering analyzes data objects without consulting a known class label or category; that is, it is an unsupervised data mining technique. K-means is a widely used partitional clustering al...

Algorithms for Clustering Molecular Dynamics Configurations

Two traditional clustering algorithms are applied to configurations from a long molecular dynamics trajectory and compared using two sets of test data. First, a subset of atoms was chosen to present conformations which naturally fall into a number of clusters. Second, a subset of atoms was selected to span a relatively continuous region of conformational space rather than form discrete conforma...
