Obtaining better quality final clustering by merging a collection of clusterings

Authors

  • Selim Mimaroglu
  • Ertunc Erdil
Abstract

MOTIVATION: Clustering methods including k-means, SOM, UPGMA, DAA, CLICK, GENECLUSTER, CAST, DHC, PMETIS and KMETIS have been widely used in biological studies for gene expression, protein localization, sequence recognition and more. All these clustering methods have some benefits and drawbacks. We propose a novel graph-based clustering software called COMUSA for combining the benefits of a collection of clusterings into a final clustering having better overall quality.

RESULTS: COMUSA implementation is compared with PMETIS, KMETIS and k-means. Experimental results on artificial, real and biological datasets demonstrate the effectiveness of our method. COMUSA produces very good quality clusters in a short amount of time.

AVAILABILITY: http://www.cs.umb.edu/~smimarog/comusa

CONTACT: [email protected]
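As a rough illustration of what merging a collection of clusterings can look like, the sketch below builds a co-association (similarity) graph in which each pair of objects is weighted by the fraction of input clusterings that group them together. It then takes the connected components of the strongly weighted edges as the final clusters. This is a generic consensus scheme and not the COMUSA algorithm, whose details are not given in the abstract. The function names `co_association` and `merge_clusterings` and the `threshold` parameter are assumptions made for this example.

```python
from itertools import combinations


def co_association(clusterings):
    """Map each object pair (i, j) to the fraction of clusterings that co-cluster it."""
    n = len(clusterings[0])
    weights = {}
    for i, j in combinations(range(n), 2):
        together = sum(1 for labels in clusterings if labels[i] == labels[j])
        weights[(i, j)] = together / len(clusterings)
    return weights


def merge_clusterings(clusterings, threshold=0.5):
    """Combine label lists into one labeling (illustrative sketch, not COMUSA).

    Objects joined by an edge whose weight exceeds `threshold` end up in the
    same final cluster (connected components via union-find).
    """
    n = len(clusterings[0])
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), weight in co_association(clusterings).items():
        if weight > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(x), len(ids)) for x in range(n)]


if __name__ == "__main__":
    # Three hypothetical clusterings of six objects.
    inputs = [
        [0, 0, 0, 1, 1, 1],
        [0, 0, 1, 1, 2, 2],
        [1, 1, 1, 0, 0, 0],
    ]
    print(merge_clusterings(inputs))
```

With the three hypothetical inputs above, objects 0 to 2 and objects 3 to 5 are grouped together by a majority of the clusterings, so the sketch returns two final clusters. A simple majority threshold is only one possible design choice; graph-partitioning methods such as those named in the abstract are another.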


Related resources

Selecting Ensemble Members in Ensemble Clustering Using Voting

Clustering is the process of dividing a dataset into subsets called clusters, so that objects within a cluster are similar to each other and different from objects in other clusters. Many algorithms based on different approaches have been developed for clustering. An effective choice is to combine two or more of these algorithms to solve the clustering problem. Ensemb...


Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering

Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...


Combining multiple clusterings using similarity graph

Multiple clusterings are produced for various needs and reasons in both distributed and local environments. Combining multiple clusterings into a final clustering which has better overall quality has gained importance recently. It is also expected that the final clustering is novel, robust, and scalable. In order to solve this challenging problem we introduce a new graph-based method. Our metho...


Consensus Based Ensembles of Soft Clusterings

Cluster Ensembles is a framework for combining multiple partitionings obtained from separate clustering runs into a final consensus clustering. This framework has attracted much interest recently because of its numerous practical applications, and a variety of approaches including Graph Partitioning, Maximum Likelihood, Genetic algorithms, and Voting-Merging have been proposed. The vast majorit...


Clustering Learning Objects Collections Using Cluster Ensembles

Learning Object Repositories are increasingly being used in learning systems to provide high-quality, reusable educational materials. A relevant data mining problem associated with the automatic categorization of learning objects is the discovery of intrinsic classes based on the textual contents of the meta-data records. In this paper, we present a cluster ensemble method that is applicable f...



Journal title:
  • Bioinformatics

Volume 26, Issue 20

Pages -

Publication date 2010