clustering validity

Symmetry as A new Measure for Cluster Validity

2002

Chien-Hsing Chou Mu-Chun Su Eugene Lai

In this paper, a cluster validity measure is presented to infer the appropriateness of data partitions. The proposed validity measure adopts a novel non-metric distance measure based on the idea of "point symmetry". The proposed validity measure can be applied in finding the number of clusters of different geometrical structures. The performance evaluation of the validity measure compares favor...

Analysis of Clustering Evaluation Considering Features of Item Response Data Using Data Mining Technique for Setting Cut-Off Scores

Journal: Symmetry 2017

Byoungwook Kim JaMee Kim Gangman Yi

The setting of standards is a critical process in educational evaluation, but it is time-consuming and expensive because it is generally conducted by a group of education experts. The purpose of this paper is to find a suitable cluster validity index that considers the features of item response data for setting cut-off scores. In this study, nine representative cluster validity indexes were used t...

A Hybrid Fuzzy Clustering Method with a Robust Validity Index

2014

Horng-Lin Shieh

A robust validity index for the fuzzy c-means (FCM) algorithm is proposed in this paper. The purpose of fuzzy clustering is to partition a given set of training data into several different clusters that can then be modeled by fuzzy theory. The FCM algorithm has become the most widely used method in fuzzy clustering. Although some successful applications of FCM have been proposed, a disad...

A comprehensive validity index for clustering

Journal: Intell. Data Anal. 2008

Sandro Saitta Benny Raphael Ian F. C. Smith

Cluster validity indices are used both for estimating the quality of a clustering algorithm and for determining the correct number of clusters in data. Even though several indices exist in the literature, most of them are only relevant for data sets that contain at least two clusters. This paper introduces a new bounded index for cluster validity called the score function (SF), a double exponen...

Combined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL): A Robust Method for Selection of Cluster Number, K

2015

Timothy E. Sweeney Albert C. Chen Olivier Gevaert

In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of 'dark art', with multiple techniques available for assessing the validity of unsupervise...

Nonparametric Genetic Clustering: Comparison of Validity Indices

2001

Sanghamitra Bandyopadhyay Ujjwal Maulik

A variable string length genetic algorithm (GA) is used to develop a novel nonparametric clustering technique for cases where the number of clusters is not fixed a priori. Chromosomes in the same population may now have different lengths since they encode different numbers of clusters. The crossover operator is redefined to handle variable string lengths. A cluster validity index is used as a...

Clustering validity based on the most similarity

Journal: CoRR 2013

Raheleh Namayandeh Farzad Didehvar Zahra Shojaei

A basic requirement of many studies is the need to classify data. Clustering is a proposed method for summarizing networks. Clustering methods can be divided into two categories: model-based approaches and algorithmic approaches. Since most clustering methods depend on their input parameters, it is important to evaluate the result of a clustering algorithm with its different ...

Cluster Validity Measures Dynamic Clustering Algorithms

2015

S. Angel Latha Mary A. N. Sivagami Usha Rani

Cluster analysis finds its place in many applications, especially data analysis, image processing, pattern recognition, market research (grouping customers based on purchasing patterns), classification of web documents for information discovery, and outlier detection. It also acts as a tool for gaining insight into the distribution of data and observing the characteristics of each cluster. This ensures ...

Relative clustering validity criteria: A comparative overview

Journal: Statistical Analysis and Data Mining 2010

Lucas Vendramin Ricardo J. G. B. Campello Eduardo R. Hruschka

Many different relative clustering validity criteria exist that are very useful in practice as quantitative measures for evaluating the quality of data partitions, and new criteria are still proposed from time to time. These criteria are endowed with particular features that may make each of them able to outperform others in specific classes of problems. In addition, they may have complet...

Threshold Validity for Mutual Neighborhood Clustering

Journal: IEEE Trans. Pattern Anal. Mach. Intell. 1993

Stephen P. Smith

Clustering algorithms have the annoying habit of finding clusters in random data. This note presents a theoretical analysis of the threshold of the mutual neighborhood clustering algorithm (MNCA) [1] under the hypothesis of random data. This yields a theoretical minimum value of this threshold below which even unclustered data is broken into separate clusters. To derive the threshold, a theorem...
