Large margin estimation of Gaussian mixture model parameters with extended Baum-Welch for spoken language recognition
Authors
Abstract
Discriminative training (DT) methods for acoustic models, such as SVMs and MMI-trained GMMs, have proven effective in spoken language recognition. In this paper we propose a DT method for GMMs based on large margin (LM) estimation. Unlike traditional MMI or MCE methods, LM estimation attempts to improve the generalization ability of the GMM so that it can handle new data that is mismatched with the training data. We define the multi-class separation margin as a function of GMM likelihoods, and derive update formulae for the GMM parameters with the extended Baum-Welch algorithm. Results on the NIST Language Recognition Evaluation (LRE) 2007 task show that LM estimation achieves better performance and faster convergence than MMI estimation.
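The abstract does not give the exact margin definition, so the following is only a minimal sketch of one common form used in the large-margin literature: the log-likelihood of the correct class model minus that of the best competing model. The function names (`gmm_log_likelihood`, `separation_margin`), the diagonal-covariance assumption, and the dictionary layout of `models` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of frames (T, D) under a diagonal-covariance GMM.

    weights: (K,), means: (K, D), variances: (K, D). Diagonal covariances
    are an assumption made here for brevity.
    """
    diff = frames[:, None, :] - means[None, :, :]                       # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, K)
    log_weighted = np.log(weights) + log_comp                           # (T, K)
    # Numerically stable log-sum-exp over the K mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return per_frame.sum()


def separation_margin(frames, models, true_label):
    """Assumed margin: score of the true class minus the best rival score.

    models maps a language label to a (weights, means, variances) tuple.
    A positive value means the utterance is classified correctly.
    """
    scores = {label: gmm_log_likelihood(frames, *params)
              for label, params in models.items()}
    best_rival = max(s for label, s in scores.items() if label != true_label)
    return scores[true_label] - best_rival
```

Under this assumed definition, training utterances with a small positive margin lie close to the decision boundary, and these are the ones a large-margin criterion would try to push further away.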
Similar resources
Maximum margin hidden Markov models for sequence classification
Discriminative learning methods are known to work well in pattern classification tasks and often show benefits compared to generative learning. This is particularly true in the case of model mismatch, i.e., when the model cannot represent the true data distribution. In this paper, we derive discriminative maximum margin learning for hidden Markov models (HMMs) with emission probabilities represented by G...
Audio classification using extended Baum-Welch transformations
Audio classification has applications in a variety of contexts, such as automatic sound analysis, supervised audio segmentation and in audio information search and retrieval. Extended Baum-Welch (EBW) transformations are most commonly used as a discriminative technique for estimating parameters of Gaussian mixtures, though recently they have been applied in unsupervised audio segmentation. In t...
A General Approximation-Optimization Approach to Large Margin Estimation of HMMs
The most successful modeling approach to automatic speech recognition (ASR) is to use a set of hidden Markov models (HMMs) as the acoustic models for subword or whole-word speech units and to use the statistical N-gram model as language model for words and/or word classes in sentences. All the model parameters, including HMMs and N-gram models, are estimated from a large amount of training data...
Extended Baum-Welch Reestimation of G on Reverse Jensen In
In this paper we derive the well-known EBW reestimation formulae for Gaussian mixture models using the recently proposed reverse Jensen inequality. In addition to the simplicity of the derivation, it leads to closed-form expressions for the D of each Gaussian in the mixture. Using some approximations, it is shown that the expressions can be reduced to the popular formula of [8] with a Gaussian ...
A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models
We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the...