Cluster validation using information stability measures
Abstract
In this work, a novel technique for cluster validation based on cluster stability properties is presented. The proposed stability index is based on the variation of some information measures over the partitions generated by a given clustering model, where that variation arises from the variability in clustering solutions produced by different sample sets. © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.patrec.2009.07.009. Author contacts listed: D. Pa… and J.S. Sánchez (corresponding author details truncated in the source).
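The abstract does not say which information measures the index uses or how they are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes variation of information as the information measure, and it assumes every partition has already been expressed as a label list over the same set of points (for example, by extending each sample's clustering to a common evaluation set). The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(labels):
    """Shannon entropy (in nats) of a partition given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def variation_of_information(a, b):
    """VI(a, b) = H(a) + H(b) - 2 I(a; b); zero when the partitions coincide."""
    return entropy(a) + entropy(b) - 2.0 * mutual_information(a, b)


def instability(partitions):
    """Assumed index: mean pairwise VI over partitions of the same points.

    Lower values mean the clustering model gives more similar partitions
    across sample sets, i.e. it is more stable under this assumed measure.
    """
    pairs = list(combinations(partitions, 2))
    return sum(variation_of_information(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Three hypothetical labelings of six points, as if produced by
    # clustering three different sample sets.
    runs = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],  # same partition, labels swapped
        [0, 0, 1, 1, 1, 1],  # one point assigned differently
    ]
    print(instability(runs))
```

Under these assumptions, one would compute the index for each candidate number of clusters and prefer the value with the lowest instability.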
Related papers
Cluster Stability for Finite Samples
Over the past few years, the notion of stability in data clustering has received growing attention as a cluster validation criterion in a sample-based framework. However, recent work has shown that as the sample size increases, any clustering model will usually become asymptotically stable. This led to the conclusion that stability is lacking as a theoretical and practical tool. The discrepancy...
Cluster Validity Measures Dynamic Clustering Algorithms
Cluster analysis finds its place in many applications especially in data analysis, image processing, pattern recognition, market research by grouping customers based on purchasing pattern, classifying documents on web for information discovery, outlier detection applications and act as a tool to gain insight into the distribution of data to observe characteristics of each cluster. This ensures ...
A Resampling Approach to Cluster Validation
The concept of cluster stability is introduced as a means for assessing the validity of data partitionings found by clustering algorithms. It allows us to explicitly quantify the quality of a clustering solution, without being dependent on external information. The principle of maximizing the cluster stability can be interpreted as choosing the most self-consistent data partitioning. We present...
clValid, an R package for cluster validation
The R package clValid contains functions for validating the results of a clustering analysis. There are three main types of cluster validation measures available, “internal”, “stability”, and “biological”. The user can choose from nine clustering algorithms in existing R packages, including hierarchical, K-means, self-organizing maps (SOM), and model based clustering. In addition, we provide a ...
A New Asymmetric Criterion for Cluster Validation
In this paper a new criterion for clusters validation is proposed. Many stability measures to validate a cluster have been proposed such as Normalized Mutual Information. We propose a new criterion for clusters validation. The drawback of the common approach is discussed in this paper and then a new asymmetric criterion is proposed to assess the association between a cluster and a partition whi...
Journal:
- Pattern Recognition Letters
Volume 31 (issue not listed)
Pages: not listed
Publication year: 2010