A Noise-Resistant Fuzzy C Means Algorithm for Clustering - Fuzzy Systems Proceedings, 1998. IEEE World Congress on Computational Intelligence., The 1998 IEEE
Abstract
Probabilistic clustering techniques use the concept of memberships to describe the degree to which a vector belongs to a cluster. The use of memberships provides probabilistic methods with more realistic clustering than “hard” techniques. However, fuzzy schemes (like the Fuzzy c Means algorithm, FCM) are often sensitive to outliers. We review four existing algorithms devised to reduce this sensitivity: the noise cluster (NC) algorithm of Dave, the Possibilistic c Means (PCM) scheme of Keller and Krishnapuram, the Least Biased Fuzzy Clustering (LBFC) method of Beni and Liu, and the Fuzzy Possibilistic c Means algorithm of Pal et al. We then propose the new Credibilistic Fuzzy c Means (CFCM) algorithm to improve on these methods. It uses a new variable, the credibility of a vector, which measures the typicality of the vector to the whole data set. By taking credibility into account, CFCM generates centroids that are less sensitive to outliers than those of other techniques, and closer to the centroids generated when the outliers are artificially removed.

1. Existing “Robust” Clustering Techniques

Clustering algorithms [1] partition a data set such as $X = \{x_1, x_2, \ldots, x_n\}$ to reveal the underlying structure of X. The development of clustering techniques is a mature art, and there exist families of “hard,” “fuzzy” and “possibilistic” algorithms, yielding different set partitions and drawing on different principles and objective functions. “Hard” or “crisp” models unequivocally assign each vector in the data set X to one of c subsets, where c is a natural number. Fuzzy and possibilistic clustering algorithms generate a membership matrix U, whose elements assume values in [0,1]. If $u_{ik} = a$, we say that the membership of $x_k$ in cluster i is a. In this paper we denote the data set as X, its n vectors as $\{x_i\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^p\ \forall i$, and the cluster centers (centroids) as $v_i \in \mathbb{R}^p\ \forall i = 1, \ldots, c$. We collect the centroids in the matrix $V = [v_1, v_2, \ldots, v_c]^T$.

The Fuzzy c Means Algorithm (FCM) [2] is the most popular fuzzy clustering algorithm. It assumes that the number of clusters, c, is known a priori, and minimizes
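To make the notation of the membership matrix U and the centroid matrix V concrete, here is a minimal pure-Python sketch of the textbook FCM alternating updates as they are commonly stated in the literature. It is not the authors' code and does not implement CFCM or any of the robust variants. The function names `fcm` and `sq_dist`, the fuzzifier default `m = 2`, the random initialization from data points, and the stopping rule are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fcm(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook Fuzzy c-Means sketch; returns (centroids V, memberships U).

    U[i][k] is the membership of X[k] in cluster i; each column sums to 1.
    m > 1 is the fuzzifier (m = 2 is an assumed, commonly used default).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    # Assumed initialization: c data points drawn at random as centroids.
    V = [list(x) for x in rng.sample(X, c)]
    U = [[0.0] * n for _ in range(c)]
    for _ in range(iters):
        # Membership update: u_ik is proportional to (d_ik^2)^(-1/(m-1)).
        for k, x in enumerate(X):
            d2 = [sq_dist(x, v) for v in V]
            if min(d2) == 0.0:
                # x sits exactly on one or more centroids: share membership.
                zeros = [i for i in range(c) if d2[i] == 0.0]
                for i in range(c):
                    U[i][k] = 1.0 / len(zeros) if i in zeros else 0.0
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                s = sum(inv)
                for i in range(c):
                    U[i][k] = inv[i] / s
        # Centroid update: v_i is the mean of the data weighted by u_ik^m.
        V_new = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            V_new.append(
                [sum(w[k] * X[k][j] for k in range(n)) / total for j in range(p)]
            )
        # Stop when no centroid moved more than tol (in squared distance).
        shift = max(sq_dist(a, b) for a, b in zip(V, V_new))
        V = V_new
        if shift < tol:
            break
    return V, U
```

Calling `fcm(X, c=2)` on a list of p-dimensional points returns a list of c centroids and a c-by-n membership matrix with entries in [0,1]. Because every column of U is forced to sum to 1, a distant outlier still receives substantial membership in some cluster and pulls its centroid, which is the sensitivity the abstract refers to.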
Similar resources
Multi-Objective Genetic Local Search for Minimizing the Number of Fuzzy Rules for Pattern Classifica - Fuzzy Systems Proceedings, 1998. IEEE World Congress on Computational Intelligence., The 1998 IEEE
For constructing compact fuzzy rule-based systems with high classification performance, we have already formulated a rule selection problem. Our rule selection problem has two objectives: to minimize the number of selected fuzzy if-then rules (i.e., to minimize the fuzzy rule base) and to maximize the number of correctly classified patterns (i.e., to maximize the classification performance). In...
Bilateral Weighted Fuzzy C-Means Clustering
Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...
Editorial: Welcome To The IEEE Neural Networks Society
I WANT to welcome you to our newly formed society. On February 17, 2002, the IEEE Neural Networks Council (NNC), publisher of the IEEE TRANSACTIONS ON NEURAL NETWORKS (TNN), the IEEE TRANSACTIONS ON FUZZY SYSTEMS (TFS), and the IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION (TEC), became the IEEE Neural Networks Society (NNS). This accomplishment was made possible by the relentless efforts of our ExCo...
A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
The fuzzy c-means clustering algorithm is a useful tool for clustering, but it is convenient only for crisp, complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...
Fuzzy clustering for symbolic data
Most of the techniques used in the literature in clustering symbolic data are based on the hierarchical methodology, which utilizes the concept of agglomerative or divisive methods as the core of the algorithm. The main contribution of this paper is to show how to apply the concept of fuzziness on a data set of symbolic objects and how to use this concept in formulating the clustering problem o...