Agglomerative Information Bottleneck
Authors
Abstract
We introduce a novel distributional clustering algorithm that explicitly maximizes the mutual information per cluster between the data and given categories. This algorithm can be considered a bottom-up hard version of the recently introduced “Information Bottleneck Method”. We relate the mutual information between clusters and categories to the Bayesian classification error, which provides another motivation for using the obtained clusters as features. The algorithm is compared with the top-down soft version of the information bottleneck method, and a relationship between the hard and soft results is established. We demonstrate the algorithm on the 20 Newsgroups data set. For a subset of two newsgroups we achieve compression by three orders of magnitude while losing only 10% of the original mutual information.
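To illustrate the bottom-up merging idea described in the abstract, the following is a minimal sketch in Python, not the authors' implementation. It relies on the standard agglomerative IB merge cost, ΔI(t_i, t_j) = (p(t_i) + p(t_j)) · JS[p(y|t_i), p(y|t_j)], i.e. the merged cluster weight times the weighted Jensen-Shannon divergence between the clusters' conditional distributions over categories, and greedily merges the pair whose merge loses the least mutual information I(T;Y). The function and variable names (`agglomerative_ib`, `js_divergence`, `p_xy`, `n_clusters`) are illustrative.

```python
import numpy as np

def js_divergence(p, q, pi1, pi2):
    """Weighted Jensen-Shannon divergence between distributions p and q."""
    m = pi1 * p + pi2 * q
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return pi1 * kl(p, m) + pi2 * kl(q, m)

def agglomerative_ib(p_xy, n_clusters):
    """Greedy bottom-up hard clustering of the rows (values of X) of the joint
    distribution p_xy, merging at each step the pair of clusters whose merge
    loses the least mutual information with Y.
    Assumes every row of p_xy has positive mass."""
    p_xy = p_xy / p_xy.sum()
    p_t = p_xy.sum(axis=1)
    # each cluster: (cluster probability p(t), conditional p(y|t), member rows)
    clusters = [(p_t[i], p_xy[i] / p_t[i], [i]) for i in range(p_xy.shape[0])]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                p_i, cond_i, _ = clusters[i]
                p_j, cond_j, _ = clusters[j]
                p_merge = p_i + p_j
                pi_i, pi_j = p_i / p_merge, p_j / p_merge
                # mutual information lost by merging clusters i and j
                delta = p_merge * js_divergence(cond_i, cond_j, pi_i, pi_j)
                if best is None or delta < best[0]:
                    best = (delta, i, j)
        _, i, j = best
        p_i, cond_i, mem_i = clusters[i]
        p_j, cond_j, mem_j = clusters[j]
        p_merge = p_i + p_j
        cond_merge = (p_i * cond_i + p_j * cond_j) / p_merge
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((p_merge, cond_merge, mem_i + mem_j))
    return clusters
```

Under these assumptions, feeding a (normalized) word-by-newsgroup count matrix and a target number of clusters would compress the vocabulary into that many word clusters, with `delta` at each step equal to the drop in I(T;Y); the exhaustive pairwise search is O(n²) per merge and is meant only to make the criterion explicit, not to be efficient.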
Similar papers
Agglomerative Multivariate Information Bottleneck
The information bottleneck method is an unsupervised non-parametric data organization technique. Given a joint distribution P(A;B), this method constructs a new variable T that extracts partitions, or clusters, over the values of A that are informative about B. In a recent paper, we introduced a general principled framework for multivariate extensions of the information bottleneck method that ...
Information Bottleneck Co-clustering
Co-clustering has emerged as an important approach for mining contingency data matrices. We present a novel approach to co-clustering based on the Information Bottleneck principle, called Information Bottleneck Co-clustering (IBCC), which supports both soft-partition and hard-partition co-clusterings, and leverages an annealing-style strategy to bypass local optima. Existing co-clustering method...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated strong performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Optimal Kullback-Leibler Aggregation via Information Bottleneck
In this paper, we present a method for reducing a regular, discrete-time Markov chain (DTMC) to another DTMC with a given, typically much smaller number of states. The cost of reduction is defined as the Kullback–Leibler divergence rate between a projection of the original process through a partition function and a DTMC on the correspondingly partitioned state space. Finding the reduced model w...