Robust speech recognition using HMM's with Toeplitz state covariance matrices
Abstract
Hidden Markov modeling of speech waveforms is studied and applied to speech recognition of clean and noisy signals. Signal vectors in each state are assumed Gaussian with zero mean and a Toeplitz covariance matrix. This model allows short signal vectors and thus is useful for speech signals with rapidly changing second-order statistics. It can also be straightforwardly adapted to noisy signals, especially when the noise is additive and independent of the signal. Since no closed-form solution exists for the maximum likelihood estimate of the Toeplitz covariance matrices, an expectation-maximization procedure was used and efficiently implemented. HMM's with Toeplitz as well as asymptotically Toeplitz (e.g., circulant, autoregressive) covariance matrices are theoretically and experimentally studied. While asymptotically all of these matrices provide similar performance, they differ significantly when the frame length is finite. Recognition results are provided for clean and noisy signals at 0-30 dB SNR.
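As a rough illustration of the state model described in the abstract, the sketch below evaluates the log-likelihood of a short signal vector under a zero-mean Gaussian whose covariance is the symmetric Toeplitz matrix built from an autocorrelation sequence. It also shows why additive, independent noise is easy to accommodate: the covariances add, and the sum of two Toeplitz matrices is again Toeplitz. The function names (`toeplitz`, `cholesky`, `log_likelihood`, `add_noise`) and the example numbers are assumptions made for this sketch. It does not reproduce the paper's expectation-maximization estimator, which the abstract does not detail.

```python
import math

def toeplitz(r):
    """Build the K x K symmetric Toeplitz matrix with (i, j) entry r[|i - j|]."""
    k = len(r)
    return [[r[abs(i - j)] for j in range(k)] for i in range(k)]

def cholesky(a):
    """Return lower-triangular L with a = L L^T (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def log_likelihood(y, r):
    """Log-density of y under a zero-mean Gaussian with covariance toeplitz(r)."""
    k = len(y)
    low = cholesky(toeplitz(r))
    # Forward substitution: solve L z = y, so that y^T R^{-1} y = z^T z.
    z = []
    for i in range(k):
        z.append((y[i] - sum(low[i][m] * z[m] for m in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(k))
    quad = sum(v * v for v in z)
    return -0.5 * (k * math.log(2.0 * math.pi) + log_det + quad)

def add_noise(r_speech, r_noise):
    """Autocorrelation of speech plus independent additive noise (covariances add)."""
    return [s + n for s, n in zip(r_speech, r_noise)]

# Hypothetical numbers: a decaying speech autocorrelation and white noise.
r_clean = [1.0, 0.5, 0.25, 0.125]
r_noisy = add_noise(r_clean, [0.1, 0.0, 0.0, 0.0])
frame = [0.3, -0.2, 0.1, 0.4]
print(log_likelihood(frame, r_clean), log_likelihood(frame, r_noisy))
```

The generic Cholesky factorization keeps the sketch short and dependency-free. For longer frames, a solver that exploits the Toeplitz structure (for example a Levinson-type recursion) would be the more natural choice.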
Similar resources
Hidden Markov modeling of speech using Toeplitz covariance matrices
Hidden Markov modeling of speech waveforms using structured covariance matrices is studied and applied to recognition of clean and noisy speech signals. This technique allows for easier model adaptation in additive noise than does cepstral modeling of speech. Waveform modeling using autoregressive (AR) structured covariances has been extensively studied and applied previously. However, other co...
Optimal Rates of Convergence for Estimating Toeplitz Covariance Matrices
Toeplitz covariance matrices are used in the analysis of stationary stochastic processes and a wide range of applications including radar imaging, target detection, speech recognition, and communications systems. In this paper, we consider optimal estimation of large Toeplitz covariance matrices and establish the minimax rate of convergence for two commonly used parameter spaces under the spect...
Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components
Robust speech recognition via hidden Markov modeling of spectral vectors is studied in this paper. The hidden Markov model (HMM) mixture components are assumed to be complex Gaussian with zero mean and diagonal covariance, and to incorporate an unknown scalar gain term. The gain term is associated with each spectral vector and models the varying energy of speech signals. It is estimated by applyi...
Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition
We propose a novel acoustic beamforming method using blind source separation (BSS) techniques based on non-negative matrix factorization (NMF). In conventional mask-based approaches, hard or soft masks are estimated and beamforming is performed using speech and noise spatial covariance matrices calculated from masked noisy observations, but the phase information of the target speech is not adeq...
Modeling with a subspace constraint on inverse covariance matrices
We consider a family of Gaussian mixture models for use in HMM-based speech recognition systems. These “SPAM” models have state-independent choices of subspaces to which the precision (inverse covariance) matrices and means are restricted to belong. They provide a flexible tool for robust, compact, and fast acoustic modeling. The focus of this paper is on the case where the means are unconstrain...