Improved Hidden Markov Model Speech Recognition Using Radial Basis Function Networks
Authors
Singer and Lippmann
Abstract
A high-performance speaker-independent isolated-word hybrid speech recognizer was developed which combines Hidden Markov Models (HMMs) and Radial Basis Function (RBF) neural networks. In recognition experiments using a speaker-independent E-set database, the hybrid recognizer had an error rate of 11.5% compared to 15.7% for the robust unimodal Gaussian HMM recognizer upon which the hybrid system was based. These results and additional experiments demonstrate that RBF networks can be successfully incorporated in hybrid recognizers and suggest that they may be capable of good performance with fewer parameters than required by Gaussian mixture classifiers. A global parameter optimization method designed to minimize the overall word error rather than the frame recognition error failed to reduce the error rate.
1 HMM/RBF HYBRID RECOGNIZER
A hybrid isolated-word speech recognizer was developed which combines neural network and Hidden Markov Model (HMM) approaches. The hybrid approach is an attempt to capitalize on the superior static pattern classification performance of neural network classifiers [6] while preserving the temporal alignment properties of HMM Viterbi decoding. Our approach is unique when compared to other studies [2, 5] in that we use Radial Basis Function (RBF) rather than multilayer sigmoidal networks. RBF networks were chosen because their static pattern classification performance is comparable to that of other networks and they can be trained rapidly using a one-pass matrix inversion technique [8]. The hybrid HMM/RBF isolated-word recognizer is shown in Figure 1. For each
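The text above says only that RBF networks can be trained rapidly with a one-pass matrix inversion technique [8]; it does not give the details. As a rough illustration of that general idea, and not the authors' implementation, the sketch below assumes fixed Gaussian centers, a single shared width, one-hot class targets, and output weights obtained by solving the least-squares normal equations in one step. All function names, the toy data, and the parameter values are hypothetical.

```python
import math

def rbf_features(x, centers, width):
    """Gaussian basis activations for one input vector, plus a bias term."""
    feats = []
    for c in centers:
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        feats.append(math.exp(-d2 / (2.0 * width ** 2)))
    return feats + [1.0]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def train_output_weights(X, labels, centers, width, n_classes):
    """One-pass fit: solve (Phi^T Phi) W = Phi^T T for one-hot targets T."""
    Phi = [rbf_features(x, centers, width) for x in X]
    m = len(Phi[0])
    A = [[sum(row[i] * row[j] for row in Phi) for j in range(m)]
         for i in range(m)]
    # Phi^T T: with one-hot targets, column k sums features over class-k rows.
    B = [[sum(row[i] for row, y in zip(Phi, labels) if y == k)
          for k in range(n_classes)] for i in range(m)]
    return solve(A, B)

def classify(x, centers, width, W):
    """Return the index of the largest network output for input x."""
    feats = rbf_features(x, centers, width)
    scores = [sum(f * W[i][k] for i, f in enumerate(feats))
              for k in range(len(W[0]))]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy two-class data (made up for illustration, not speech features).
X = [(-2.0, -2.1), (-1.8, -2.0), (-2.2, -1.9), (-2.0, -1.7),
     (2.0, 2.1), (1.8, 2.0), (2.2, 1.9), (2.0, 1.7)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
centers = [(-2.0, -2.0), (2.0, 2.0)]

W = train_output_weights(X, labels, centers, width=1.0, n_classes=2)
predictions = [classify(x, centers, 1.0, W) for x in X]
```

No gradient iterations are involved: once the basis activations are computed, the output weights follow from a single linear solve, which is the property that makes this kind of training fast. How the network outputs feed the HMM Viterbi decoder is not covered here, since the source text breaks off before describing it.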
Similar resources
Improved Hidden Markov Models Speech Recognition Using Radial Basis Function Networks
A high performance speaker-independent isolated-word hybrid speech recognizer was developed which combines Hidden Markov Models (HMMs) and Radial Basis Function (RBF) neural networks. In recognition experiments using a speaker-independent E-set database, the hybrid recognizer had an error rate of 11.5% compared to 15.7% for the robust unimodal Gaussian HMM recognizer upon which the hybrid syste...
A Hybrid Speech Recognition System with Hidden Markov Model and Radial Basis Function Neural Network
We analyze the performance of continuous speech recognition of a speaker independent system using Hidden Markov Model and Artificial Neural Network. Modern speech recognition systems use different combinations of the standard techniques over the basic approach to improve performance accuracy. One such combination which has gained more attention is the hybrid model. Our hybrid system for continu...
Probability Estimation by Feed-forward Networks in Continuous Speech Recognition
We review the use of feed-forward networks as estimators of probability densities in hidden Markov modelling. In this paper we are mostly concerned with radial basis functions (RBF) networks. We note the isomorphism of RBF networks to tied mixture density estimators; additionally we note that RBF networks are trained to estimate posteriors rather than the likelihoods estimated by tied mixture d...
Hybrid model decomposition of speech and noise in a radial basis function neural model framework
This paper focuses on a new approach to automatic speech recognition in noisy environments where the noise has either stationary or non-stationary statistical characteristics. The aim is to perform automatic recognition of speech in the presence of additive car noise. The technique applied is based on a combination of the Hidden Markov Model (HMM) decomposition method [1], for speech recognition...
Speaker verification based on phonetic decision making
Speaker verification based on phone modelling is examined in this paper. Phone modelling is attractive, because different phonemes have different levels of usefulness for speaker recognition, and because phone modelling essentially makes a speaker verification algorithm text independent. The speaker verification system used here is based on a two stage approach, where speech recognition (segmen...