Nonparametric Genetic Clustering: Comparison of Validity Indices
Abstract
A variable string length genetic algorithm (GA) is used to develop a novel nonparametric clustering technique for the case where the number of clusters is not fixed a priori. Chromosomes in the same population may have different lengths, since they encode different numbers of clusters. The crossover operator is redefined to handle variable string lengths. A cluster validity index is used as the measure of a chromosome's fitness. The performance of several cluster validity indices in appropriately partitioning a data set is compared: the Davies–Bouldin (DB) index, Dunn's index, two generalized versions of Dunn's index, and a recently developed index.
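To make the idea of using a validity index as fitness concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a chromosome is simply an array of cluster centers (so its length varies with the number of clusters it encodes), that points are assigned to their nearest center, and that fitness is taken as the reciprocal of the Davies–Bouldin index. The function names `assign_points`, `davies_bouldin`, and `fitness` are hypothetical, and the crossover operator is not shown because the abstract does not describe it.

```python
import numpy as np

def assign_points(data, centers):
    """Label each point with the index of its nearest center."""
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)

def davies_bouldin(data, centers):
    """Davies-Bouldin index of the partition induced by `centers`.

    Lower is better. Returns inf when fewer than two clusters are non-empty.
    """
    labels = assign_points(data, centers)
    used = np.unique(labels)  # ignore centers that attract no points
    if used.size < 2:
        return np.inf
    centroids = np.array([data[labels == k].mean(axis=0) for k in used])
    # Within-cluster scatter: mean distance of members to their centroid.
    scatter = np.array([
        np.linalg.norm(data[labels == k] - c, axis=1).mean()
        for k, c in zip(used, centroids)
    ])
    # Between-cluster separation: distance between centroids.
    sep = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)  # exclude each cluster's comparison with itself
    ratios = (scatter[:, None] + scatter[None, :]) / sep
    return ratios.max(axis=1).mean()

def fitness(data, chromosome):
    """Assumed fitness: reciprocal of the DB index (higher is better)."""
    db = davies_bouldin(data, np.asarray(chromosome, dtype=float))
    return 0.0 if np.isinf(db) else 1.0 / (db + 1e-12)

# Two well-separated square blobs.
data = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                 [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)

# Chromosomes of different lengths encode different numbers of clusters.
two_clusters = [[0.5, 0.5], [10.5, 10.5]]
three_clusters = [[0.0, 0.5], [1.0, 0.5], [10.5, 10.5]]

print(fitness(data, two_clusters) > fitness(data, three_clusters))
```

Under these assumptions the two-center chromosome scores higher than the three-center one, because splitting a compact blob raises the scatter-to-separation ratio that the DB index penalizes. Swapping in Dunn's index or another validity measure would only require replacing `davies_bouldin`.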
Similar resources
Development of An External Cluster Validity Index using Probabilistic Approach and Min-max Distance
Validating a given clustering result is a very challenging task in the real world. For this purpose, several cluster validity indices have been developed in the literature. Cluster validity indices are divided into two main categories: external and internal. External cluster validity indices rely on available supervised information, and internal validity indices utilize the intrinsic structu...
Towards a standard methodology to evaluate internal cluster validity indices
The evaluation and comparison of internal cluster validity indices is a critical problem in the clustering area. The methodology used in most of the evaluations assumes that the clustering algorithms work correctly. We propose an alternative methodology that does not make this often false assumption. We compared 7 internal cluster validity indices with both methodologies and concluded that the ...
Improved Automatic Clustering Using a Multi-Objective Evolutionary Algorithm With New Validity measure and application to Credit Scoring
In data mining, clustering is an important technique for separating unlabeled data into groups. In this paper, an attempt has been made to improve and optimize the application of heuristic clustering methods such as the genetic algorithm, PSO, the artificial bee colony algorithm, harmony search, and differential evolution to the unlabeled data of an Iranian bank...
A New Validity Measure for Heuristic Possibilistic Clustering
A heuristic approach to possibilistic clustering is an effective tool for data analysis. The approach is based on the concept of allotment among fuzzy clusters. To establish the number of clusters in a data set, a validity measure is proposed in this paper. An illustrative example applying the proposed validity measure to Anderson's Iris data is given. A comparison of the vali...
Discriminative Bayesian Nonparametric Clustering
We propose a general framework for discriminative Bayesian nonparametric clustering to promote inter-discrimination among the learned clusters in a fully Bayesian nonparametric (BNP) manner. Our method combines existing BNP clustering and discriminative models by enforcing latent cluster indices to be consistent with the predicted labels resulting from a probabilistic discriminative model. Thi...