Fuzzy clustering of categorical data using fuzzy centroids
Abstract
In this paper, the conventional fuzzy k-modes algorithm for clustering categorical data is extended by representing clusters with fuzzy centroids instead of the hard-type centroids used in the original algorithm. Fuzzy centroids make it possible to fully exploit the power of fuzzy sets in representing the uncertainty in the classification of categorical data. To test the approach, the proposed algorithm and two conventional algorithms (k-modes and fuzzy k-modes) were used to cluster three categorical data sets. The proposed method was found to give markedly better clustering results. © 2004 Elsevier B.V. All rights reserved.
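The abstract does not give formulas, so the following is only a minimal sketch of one plausible reading of a "fuzzy centroid": for each attribute, the centroid holds a weight for every category value instead of a single mode. The function names, the fuzzifier default `m=1.5`, and the exact update rules are assumptions for illustration, not the paper's stated method.

```python
from collections import defaultdict

def fuzzy_centroid(records, memberships, m=1.5):
    """Build one cluster's centroid: per attribute, a dict category -> weight.

    Assumption: each category's weight is the sum of u**m over the records
    that carry it, normalised so the weights of an attribute sum to 1.
    At least one membership must be non-zero.
    """
    centroid = []
    for a in range(len(records[0])):
        weights = defaultdict(float)
        for rec, u in zip(records, memberships):
            weights[rec[a]] += u ** m
        total = sum(weights.values())
        centroid.append({cat: w / total for cat, w in weights.items()})
    return centroid

def dissimilarity(record, centroid):
    """Per attribute, add the centroid weight NOT placed on the record's value."""
    return sum(1.0 - attr.get(value, 0.0)
               for value, attr in zip(record, centroid))

def memberships_for(record, centroids, m=1.5):
    """Standard fuzzy c-means style membership update for a single record."""
    d = [dissimilarity(record, c) for c in centroids]
    zeros = [i for i, x in enumerate(d) if x == 0.0]
    if zeros:  # record coincides with one or more centroids
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    inv = [x ** (-1.0 / (m - 1.0)) for x in d]
    s = sum(inv)
    return [v / s for v in inv]
```

A hard mode would keep only the most frequent category per attribute. The sketch keeps the full weight distribution, so a record that matches a minority category of a cluster is penalised less than one that matches no category seen in that cluster at all.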
Similar resources
A fuzzy k-partitions model for categorical data and its comparison to the GoM model
The grade of membership (GoM) model uses fuzzy sets as memberships of each individual to extreme profiles (or classes) on the likelihood function of multivariate multinomial distributions. The GoM clustering algorithm derived from the GoM model is used in cluster analysis for categorical data, but it is iterated with complicated calculations. In this paper we create another approach, termed a f...
Hierarchical clustering algorithm for categorical data using a probabilistic rough set model
Several clustering analysis techniques for categorical data exist to divide similar objects into groups. Some are able to handle uncertainty in the clustering process, whereas others have stability issues. In this paper, we propose a new technique called TMDP (Total Mean Distribution Precision) for selecting the partitioning attribute based on probabilistic rough set theory. On the basis of thi...
Modified Particle Swarm Optimization Based Adaptive Fuzzy K-Modes Clustering for Heterogeneous Medical Databases
The main purpose of data mining is to extract hidden predictive knowledge, in the form of useful information and patterns, from large databases for use in decision support. The medical field has a large number of heterogeneous databases, from which extracting hidden useful knowledge for classifying data is difficult. In order to cluster and classify the whole databases o...
A fuzzy k-modes algorithm for clustering categorical data
This correspondence describes extensions to the fuzzy k-means algorithm for clustering categorical data. By using a simple matching dissimilarity measure for categorical objects and modes instead of means for clusters, a new approach is developed, which allows the use of the k-means paradigm to efficiently cluster large categorical data sets. A fuzzy k-modes algorithm is presented and the effec...
A New Kernelized Fuzzy C-Means Clustering Algorithm with Enhanced Performance
Recently, the Kernelized Fuzzy C-Means clustering technique, in which a kernel-induced distance function is used as the similarity measure instead of the Euclidean distance used in conventional Fuzzy C-Means clustering, has gained popularity in the research community. Like the conventional Fuzzy C-Means clustering technique this technique also suffers from inconsistency in its performa...
Journal title:
- Pattern Recognition Letters
Volume 25
Publication date: 2004