Forward-backwards training of hybrid HMM/BN acoustic models
Abstract
In this paper, we describe an application of the Forward-Backwards (F-B) algorithm to maximum likelihood training of hybrid HMM/Bayesian Network (BN) acoustic models. Previously, HMM/BN parameter estimation was based on a Viterbi training algorithm that requires two passes over the training data: one for BN learning and one for updating the HMM transition probabilities. In this work, we first analyze F-B training for a conventional HMM and show that state PDF parameter estimation is analogous to training a classifier on weighted data, with the gamma variable of the Forward-Backwards algorithm playing the role of the data weight. From this perspective, it is straightforward to apply F-B training to HMM/BN models, since the BN learning algorithm allows training with weighted data. Experiments on accented speech (American, British and Australian English) show that F-B training outperforms the previous Viterbi learning approach and that the HMM/BN model performs better than the conventional HMM.
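The weighted-data view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a conventional HMM with one-dimensional Gaussian state PDFs instead of a BN, and every name in it (`state_posteriors`, `weighted_gaussian_fit`, the toy observations and initial parameters) is hypothetical. A scaled forward-backward pass produces gamma, and each state's PDF is then re-estimated from all frames, weighted by that state's gamma.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian (stand-in for the state PDF)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def state_posteriors(obs, init, trans, means, variances):
    """Scaled forward-backward pass; returns gamma[t][i] = P(state i at time t | obs)."""
    n, T = len(init), len(obs)
    b = [[gaussian_pdf(x, means[i], variances[i]) for i in range(n)] for x in obs]
    alpha = [[0.0] * n for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for j in range(n):
            if t == 0:
                prior = init[j]
            else:
                prior = sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
            alpha[t][j] = prior * b[t][j]
        scale[t] = sum(alpha[t])  # normalizing per frame avoids underflow
        alpha[t] = [a / scale[t] for a in alpha[t]]
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                trans[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(n)
            ) / scale[t + 1]
    return [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]

def weighted_gaussian_fit(obs, weights):
    """Maximum likelihood mean and variance when each sample carries a weight."""
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, obs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, obs)) / total
    return mean, var

# Toy data and initial parameters (assumed values, for illustration only).
obs = [0.1, -0.2, 0.3, 4.8, 5.2, 5.1]
init = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
means, variances = [1.0, 4.0], [1.0, 1.0]

gamma = state_posteriors(obs, init, trans, means, variances)
# Each state's PDF is trained on every frame, weighted by that state's gamma.
new_params = [
    weighted_gaussian_fit(obs, [gamma[t][i] for t in range(len(obs))])
    for i in range(len(init))
]
```

Replacing each row of `gamma` with a one-hot vector for the best-path state would turn the same update into a Viterbi-style hard assignment, which is the contrast the abstract draws. In this sketch, any learner that accepts per-sample weights could take the place of `weighted_gaussian_fit`.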
Related papers
Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues
In recent years, the number of studies investigating new directions in speech modeling that go beyond the conventional HMM has increased considerably. One promising approach is to use Bayesian Networks (BN) as speech models. Full recognition systems based on Dynamic BN as well as acoustic models using BN have been proposed lately. Our group at ATR has been developing a hybrid HMM/BN model, wh...
Advanced Acoustic Modeling with the Hybrid HMM/BN Framework
Most of the current state-of-the-art speech recognition systems are based on HMMs, which usually use a mixture of Gaussian functions as the state probability distribution model. It is common practice to use the EM algorithm for Gaussian mixture parameter learning. In this case, the learning is done in a "blind", data-driven way without taking into account how the speech signal has been produced and whic...
Hybrid HMM/BN LVCSR system integrating multiple acoustic features
In current HMM-based speech recognition systems, it is difficult to supplement acoustic spectrum features with additional information such as pitch, gender, articulator positions, etc. On the other hand, Dynamic Bayesian Networks (DBN) allow for easy combination of different features and make use of conditional dependencies between them. However, the lack of efficient algorithms has prevented their...
Acoustic Modeling of Accented English Speech for Large-vocabulary Speech Recognition
In this paper, we present a study on robust speech recognition with respect to accent variations. Differences that characterize accents in speech can be divided into two parts: phonetic and acoustic. We focus on the acoustic differences and the ways of acoustic model design and training that can be used to minimize the effect of accent variations on the speech recognition system’s performance. ...
Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework
Most of the current state-of-the-art speech recognition systems are based on speech signal parametrizations that crudely model the behavior of the human auditory system. However, little or no use is usually made of knowledge of the human speech production system. A data-driven statistical approach to incorporate this knowledge into ASR would require a substantial amount of data, which are n...