Genetically oriented clustering using variable length chromosomes

Author

  • N. A. Aspragathos
Abstract

In most cases, the sheer volume of data makes it unrealistic for domain experts to mine useful knowledge from a database by hand. Motivated by this, the paper develops a novel approach to optimized clustering. The approach is genetically oriented: it mines the vital information contained in large databases while avoiding entrapment in local optima and sensitivity to initialization. Its aim is to find an optimum set of clusters that properly classifies all training data without much computational burden. The contribution is twofold: first, it gleans the valuable information hidden in the database; second, it automatically evolves the appropriate number of cluster centers, as well as the partitioning of the data, without a priori assumptions about the cluster centers. The paper demonstrates the effectiveness of a Genetic Algorithm with variable length chromosomes for clustering data sets into an unknown number of clusters. The ability of the proposed variable length Genetic Algorithm to detect the optimum number of clusters and the corresponding partition is evaluated through various experimental tests. The results are compared with those obtained by the well-known fuzzy c-means algorithm, which is applicable only for a predefined, fixed number of clusters, and by the subtractive clustering method, in which the data points themselves are considered as candidate cluster centers.
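The idea of a variable length chromosome can be illustrated with a minimal sketch. The abstract does not give the paper's encoding, operators, or fitness function, so everything below is an assumption made for illustration: a chromosome is taken to be a list of cluster centers, the hypothetical helpers `random_chromosome`, `crossover`, and `assign` are not from the paper, and centers are seeded from data points purely for convenience.

```python
import random


def random_chromosome(points, k_min, k_max, rng):
    """Build one chromosome: a list of k centers, with k drawn at random.

    Assumption: centers are initialised as copies of randomly chosen data points.
    """
    k = rng.randint(k_min, k_max)
    return [list(p) for p in rng.sample(points, k)]


def crossover(a, b, rng):
    """One-point crossover with an independent cut in each parent.

    Cuts fall on center boundaries, so the children may differ in length
    from their parents, which lets the number of clusters evolve.
    """
    cut_a = rng.randint(1, len(a) - 1) if len(a) > 1 else 1
    cut_b = rng.randint(1, len(b) - 1) if len(b) > 1 else 1
    child1 = [c[:] for c in a[:cut_a] + b[cut_b:]]
    child2 = [c[:] for c in b[:cut_b] + a[cut_a:]]
    return child1, child2


def assign(points, centers):
    """Decode a chromosome: label each point with its nearest center."""
    def sq_dist(p, c):
        return sum((x - y) ** 2 for x, y in zip(p, c))
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0)]
    parent_a = random_chromosome(data, 2, 4, rng)
    parent_b = random_chromosome(data, 2, 4, rng)
    child_a, child_b = crossover(parent_a, parent_b, rng)
    print(len(parent_a), len(parent_b), len(child_a), len(child_b))
    print(assign(data, child_a))
```

A complete algorithm would also need a fitness measure (for example, a cluster validity index) and a mutation operator; both are omitted here because the abstract does not specify them.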

Similar resources

Efficient and exact maximum likelihood quantisation of genomic features using dynamic programming

An efficient and exact dynamic programming algorithm is introduced to quantise a continuous random variable into a discrete random variable that maximises the likelihood of the quantised probability distribution for the original continuous random variable. Quantisation is often useful before statistical analysis and modelling of large discrete network models from observations of multiple contin...

Genetic Algorithm for Clustering, Finding the Number of Clusters

In this paper a genetic algorithm for clustering is proposed. The algorithm is based on the variable length chromosomes and the notion of local points density in the clustered set. Its role is to identify the number of clusters in the clustered set and to partition this set into particular clusters. The tests were conducted for two different sets of two dimensional data. The algorithm performed...

Nonparametric Genetic Clustering: Comparison of Validity Indices

Variable string length genetic algorithm (GA) is used for developing a novel nonparametric clustering technique when the number of clusters is not fixed a priori. Chromosomes in the same population may now have different lengths since they encode different number of clusters. The crossover operator is redefined to tackle the concept of variable string length. Cluster validity index is used as a...

A multiobjective spatial fuzzy clustering algorithm for image segmentation

This article describes a multiobjective spatial fuzzy clustering algorithm for image segmentation. To obtain satisfactory segmentation performance for noisy images, the proposed method introduces the non-local spatial information derived from the image into fitness functions which respectively consider the global fuzzy compactness and fuzzy separation among the clusters. After producing the set...

Chromosome-length polymorphism in fungi.

The examination of fungal chromosomes by pulsed-field gel electrophoresis has revealed that length polymorphism is widespread in both sexual and asexual species. This review summarizes characteristics of fungal chromosome-length polymorphism and possible mitotic and meiotic mechanisms of chromosome length change. Most fungal chromosome-length polymorphisms are currently uncharacterized with res...

Publication date: 2009