Kernel MDL to Determine the Number of Clusters
نویسندگان
چکیده
In this paper we propose a new criterion, based on Minimum Description Length (MDL), to estimate an optimal number of clusters. This criterion, called Kernel MDL (KMDL), is particularly adapted to the use of kernel K-means clustering algorithm. Its formulation is based on the definition of MDL derived for Gaussian Mixture Model (GMM). We demonstrate the efficiency of our approach on both synthetic data and real data such as SPOT5 satellite images.
منابع مشابه
MDL-Based Cluster Number Decision Methods for Speaker Clustering and MLLR Adaptation
Speaker clustering is one of the major methods for speaker adaptation. MLLR (Maximum Likelihood Linear Regression) adaptation using transformation matrices corresponding to phone classes/clusters is another useful method especially when the length of utterances for adaptation is limited. In these methods, how to decide the most appropriate number of clusters is an important research issue. This...
متن کاملMML-Based Approach for Finite Dirichlet Mixture Estimation and Selection
This paper proposes an unsupervised algorithm for learning a finite Dirichlet mixture model. An important part of the unsupervised learning problem is determining the number of clusters which best describe the data. We consider here the application of the Minimum Message length (MML) principle to determine the number of clusters. The Model is compared with results obtained by other selection cr...
متن کاملInvestigation on Several Model Selection Criteria for Determining the Number of Cluster
Abstract Determining the number of clusters is a crucial problem in clustering. Conventionally, selection of the number of clusters was effected via cost function based criteria such as Akaike’s information criterion (AIC), the consistent Akaike’s information criterion (CAIC), the minimum description length (MDL) criterion which formally coincides with the Bayesian inference criterion (BIC). In...
متن کاملRobust growing neural gas algorithm with application in cluster analysis
We propose a novel robust clustering algorithm within the Growing Neural Gas (GNG) framework, called Robust Growing Neural Gas (RGNG) network.The Matlab codes are available from . By incorporating several robust strategies, such as outlier resistant scheme, adaptive modulation of learning rates and cluster repulsion method into the traditional GNG framework, the proposed RGNG network possesses ...
متن کاملارایه شاخصی جدید جهت سنجش اعتبار خوشه بندی در الگوریتم های خوشه بندی فازی نوع-2
One of the main issues in fuzzy clustering is to determine the number of clusters that should be available before clustering and selection of different values for the number of clusters will lead to different results. Then, different clusters obtained from different number of clusters should be validated with an index. But so far such an index has not been introduced for interval type-2 fuzzy C...
متن کامل