Unsupervised training of HMMs with variable number of mixture components per state
Authors
Abstract
In this work, automatic methods for determining the number of Gaussians per state in a set of Hidden Markov Models are studied. Four mix-up criteria are proposed to decide how to grow each state. These criteria, derived from Maximum Likelihood scores, aim to increase the discrimination between states, yielding a different number of Gaussians per state. We compare the proposed methods with the common approach, in which every state uses the same pre-fixed number of density functions chosen by the designer. Experimental results demonstrate that recognition performance can be maintained while reducing the total number of density functions by 17% (from 2046 down to 1705). These results are obtained with a flexible large-vocabulary isolated-word recognizer using context-dependent models.
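The abstract does not spell out the four criteria, but the general idea of a likelihood-driven mix-up can be sketched: split a Gaussian in some state, reestimate, and grow the state whose likelihood improves most. The sketch below is a minimal illustration under assumed details (splitting the heaviest component by perturbing its mean, diagonal covariances, a few EM iterations after the split); the function names and the exact splitting rule are hypothetical, not the paper's.

```python
import numpy as np

def gmm_loglik(data, weights, means, variances):
    """Total log-likelihood of data (N x D) under a diagonal-covariance GMM."""
    dens = np.zeros(len(data))
    for w, m, v in zip(weights, means, variances):
        dens += w * np.exp(-0.5 * np.sum((data - m) ** 2 / v + np.log(2 * np.pi * v), axis=1))
    return float(np.sum(np.log(dens + 1e-300)))

def em_steps(data, weights, means, variances, iters=10):
    """A few EM iterations to reestimate the mixture after a split."""
    for _ in range(iters):
        # E-step: per-component densities and responsibilities (N x K)
        comp = np.stack([
            w * np.exp(-0.5 * np.sum((data - m) ** 2 / v + np.log(2 * np.pi * v), axis=1))
            for w, m, v in zip(weights, means, variances)
        ], axis=1)
        resp = comp / (comp.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: reestimate weights, means, diagonal variances
        nk = resp.sum(axis=0) + 1e-300
        weights = nk / len(data)
        means = (resp.T @ data) / nk[:, None]
        variances = (resp.T @ data ** 2) / nk[:, None] - means ** 2 + 1e-6
    return weights, means, variances

def split_heaviest(weights, means, variances, eps=0.5):
    """Split the largest-weight Gaussian along its standard deviation
    (a common mix-up heuristic; the exact rule is an assumption here)."""
    k = int(np.argmax(weights))
    off = eps * np.sqrt(variances[k])
    w = np.concatenate([np.delete(weights, k), [weights[k] / 2] * 2])
    m = np.vstack([np.delete(means, k, axis=0), means[k] - off, means[k] + off])
    v = np.vstack([np.delete(variances, k, axis=0), variances[k], variances[k]])
    return w, m, v

def pick_state_to_grow(states, data_per_state):
    """Return the index of the state whose split gives the largest ML gain."""
    gains = []
    for (w, m, v), x in zip(states, data_per_state):
        before = gmm_loglik(x, w, m, v)
        after = gmm_loglik(x, *em_steps(x, *split_heaviest(w, m, v)))
        gains.append(after - before)
    return int(np.argmax(gains))

# Toy demo: a state whose data is bimodal benefits from an extra Gaussian,
# a unimodal state barely does, so only the first state is grown.
rng = np.random.default_rng(0)
bimodal = np.vstack([rng.normal(-2, 0.5, (100, 2)), rng.normal(2, 0.5, (100, 2))])
unimodal = rng.normal(0.0, 0.5, (200, 2))
one_gauss = lambda x: (np.array([1.0]), x.mean(0, keepdims=True), x.var(0, keepdims=True))
states = [one_gauss(unimodal), one_gauss(bimodal)]
print(pick_state_to_grow(states, [unimodal, bimodal]))
```

Ranking states by likelihood gain rather than splitting every state equally is what allows the variable number of components per state that the abstract describes.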
Similar works
Comparison of Tied-Mixture and State-Clustered HMMs with Respect to Recognition Performance and Training Method
Tied-mixture HMMs have been proposed as the acoustic model for large-vocabulary continuous speech recognition and have yielded promising results. They share base distributions and provide more flexibility in choosing the degree of tying than state-clustered HMMs. However, it is unclear which acoustic model is superior to the other under the same training data. Moreover, the LBG algorithm and the EM alg...
Noise-Robust Hidden Markov Models for Limited Training Data for Within-Species Bird Phrase Classification
Hidden Markov Models (HMMs) have been studied and used extensively in speech and birdsong recognition, but they are not robust to limited training data and noise. This paper presents two novel approaches to training continuous and discrete HMMs with extremely limited data. First, the algorithm learns the global Gaussian Mixture Models (GMMs) for all training phrases available. GMM parameters ar...
A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
We present a new scalable approach to using deep neural network (DNN) derived features in Gaussian mixture density hidden Markov model (GMM-HMM) based acoustic modeling for large vocabulary continuous speech recognition (LVCSR). The DNN-based feature extractor is trained from a subset of training data to mitigate the scalability issue of DNN training, while GMM-HMMs are trained by using state-o...
Model Selection for Mixture Models Using Perfect Sample
We consider a perfect sampling method for model selection of finite mixture models with either a known (fixed) or an unknown number of components, which can be applied in the most general setting with respect to the relation between the rival models and the true distribution: both, one, or neither may be well-specified or mis-specified, and they may be nested or non-nested. We consider mixt...
Automatic Generation of Non-uniform and Context-Dependent HMMs Based on the Variational Bayesian Approach
We propose a new method both for automatically creating non-uniform, context-dependent HMM topologies, and selecting the number of mixture components based on the Variational Bayesian (VB) approach. Although the Maximum Likelihood (ML) criterion is generally used to create HMM topologies, it has an over-fitting problem. Recently, to avoid this problem, the VB approach has been applied to create...