MIC: Mutual Information based hierarchical Clustering
Authors
Abstract
Clustering is a concept used in a huge variety of applications. We review a conceptually very simple algorithm for hierarchical clustering, referred to below as the mutual information clustering (MIC) algorithm. It uses mutual information (MI) as a similarity measure and exploits its grouping property: the MI between three objects X, Y, and Z is equal to the sum of the MI between X and Y plus the MI between Z and the combined object (XY). We use MIC both in the Shannon (probabilistic) version of information theory, where the “objects” are probability distributions represented by random samples, and in the Kolmogorov (algorithmic) version, where the “objects” are symbol sequences. We apply our method to the construction of phylogenetic trees from mitochondrial DNA sequences, and we reconstruct the fetal ECG from the output of independent component analysis (ICA) applied to the ECG of a pregnant woman.
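The grouping property stated in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the three-object MI is defined as I(X,Y,Z) = H(X) + H(Y) + H(Z) - H(X,Y,Z), represents a discrete joint distribution as a dictionary, and uses hypothetical helper names (`entropy`, `marginal`, `grouping_terms`).

```python
import math
import random
from collections import defaultdict


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def marginal(joint, axes):
    """Marginalize a joint table {(x, y, z): p} onto the given axes."""
    out = defaultdict(float)
    for key, p in joint.items():
        out[tuple(key[a] for a in axes)] += p
    return out


def grouping_terms(joint):
    """Return (I(X,Y,Z), I(X,Y), I((XY),Z)) for a joint table over (x, y, z).

    Assumed definition: I(X,Y,Z) = H(X) + H(Y) + H(Z) - H(X,Y,Z).
    """
    def h(axes):
        return entropy(marginal(joint, axes).values())

    hx, hy, hz = h((0,)), h((1,)), h((2,))
    hxy, hxyz = h((0, 1)), h((0, 1, 2))
    i_xyz = hx + hy + hz - hxyz   # MI among all three objects
    i_xy = hx + hy - hxy          # MI between X and Y
    i_xy_z = hxy + hz - hxyz      # MI between the combined object (XY) and Z
    return i_xyz, i_xy, i_xy_z


# Random joint distribution over a 2 x 3 x 4 alphabet (made-up data).
random.seed(0)
weights = {(x, y, z): random.random()
           for x in range(2) for y in range(3) for z in range(4)}
total = sum(weights.values())
joint = {k: w / total for k, w in weights.items()}

i_xyz, i_xy, i_xy_z = grouping_terms(joint)
print(abs(i_xyz - (i_xy + i_xy_z)) < 1e-12)
```

With this definition the identity holds for any joint distribution, because the H(X,Y) terms cancel. That cancellation is what lets a hierarchical procedure merge two objects into one and keep comparing the merged object with the rest.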
Similar resources
Hierarchical Clustering Using Mutual Information
We present a method for hierarchical clustering of data called the mutual information clustering (MIC) algorithm. It uses mutual information (MI) as a similarity measure and exploits its grouping property: the MI between three objects X, Y, and Z is equal to the sum of the MI between X and Y plus the MI between Z and the combined object (XY). We use this both in the Shannon (probabilistic) versi...
Hierarchical Clustering Based on Mutual Information
Motivation: Clustering is a frequently used concept in a variety of bioinformatics applications. We present a new method for hierarchical clustering of data called the mutual information clustering (MIC) algorithm. It uses mutual information (MI) as a similarity measure and exploits its grouping property: the MI between three objects X, Y, and Z is equal to the sum of the MI between X and Y plus th...
arXiv:q-bio.QM/0311037v1 27 Nov 2003 Hierarchical Clustering Using Mutual Information
We present a method for hierarchical clustering of data called the mutual information clustering (MIC) algorithm. It uses mutual information (MI) as a similarity measure and exploits its grouping property: the MI between three objects X, Y, and Z is equal to the sum of the MI between X and Y plus the MI between Z and the combined object (XY). We use this both in the Shannon (probabilistic) versi...
Clustering of a Number of Genes Affecting in Milk Production using Information Theory and Mutual Information
Information theory is a branch of mathematics. Information theory is used in genetic and bioinformatics analyses and can be used for many analyses related to the biological structures and sequences. Bio-computational grouping of genes facilitates genetic analysis, sequencing and structural-based analyses. In this study, after retrieving gene and exon DNA sequences affecting milk yield in dairy ...
A Bayesian Alternative to Mutual Information for the Hierarchical Clustering of Dependent Random Variables
The use of mutual information as a similarity measure in agglomerative hierarchical clustering (AHC) raises an important issue: some correction needs to be applied for the dimensionality of variables. In this work, we formulate the decision of merging dependent multivariate normal variables in an AHC procedure as a Bayesian model comparison. We found that the Bayesian formulation naturally shri...