Modeling temporal dependency for robust estimation of LP model parameters in speech enhancement
Abstract
This paper presents a novel approach to robust estimation of linear prediction (LP) model parameters for speech enhancement. The robustness stems from prior knowledge of the clean speech and the interfering noise, each represented by a separate codebook of LP model parameters. We propose to model the temporal dependency between short-time model parameters with a composite hidden Markov model (HMM) constructed by combining the speech and noise codebooks. Optimal speech model parameters are estimated from the HMM state sequence that best matches the input observation. To further improve estimation accuracy, we propose to interpolate multiple HMM state sequences so that the estimated speech parameters are not limited by the codebook coverage. Experimental results show that temporal dependency modeling and state interpolation improve the segmental signal-to-noise ratio, PESQ score, and spectral distortion of the enhanced speech.
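The composite-HMM idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's method: codebook entries are stored as power spectra, the speech and noise chains are taken to evolve independently (so the composite transition matrix is a Kronecker product), the noisy spectrum is modeled as the sum of a speech and a noise entry, and the frame score is a simple negative squared log-spectral distance. The function names are hypothetical, and the interpolation of multiple state sequences is not shown.

```python
import numpy as np

def composite_transitions(A_speech, A_noise):
    """Transitions over (speech, noise) state pairs, assuming independent chains."""
    # Composite index k = i * Nn + j for speech state i and noise state j.
    return np.kron(A_speech, A_noise)

def composite_spectra(speech_cb, noise_cb):
    """Model spectrum for every pair: speech + noise power spectra (additive noise assumed)."""
    n_bins = speech_cb.shape[1]
    return (speech_cb[:, None, :] + noise_cb[None, :, :]).reshape(-1, n_bins)

def log_scores(obs, model_spectra):
    """Illustrative frame score: negative squared log-spectral distance, shape (T, K)."""
    diff = np.log(obs[:, None, :]) - np.log(model_spectra[None, :, :])
    return -np.sum(diff ** 2, axis=2)

def viterbi(log_pi, log_A, log_B):
    """Best state path given log initial, transition, and per-frame scores."""
    T, K = log_B.shape
    delta = log_pi + log_B[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A          # scores[i, j]: move from i to j
        back[t] = np.argmax(scores, axis=0)      # best predecessor of each state j
        delta = scores[back[t], np.arange(K)] + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta)
    for t in range(T - 1, 0, -1):                # backtrack
        path[t - 1] = back[t, path[t]]
    return path

def decode_states(obs, speech_cb, noise_cb, A_speech, A_noise):
    """Return (speech_idx, noise_idx) per frame for the best composite path."""
    n_noise = noise_cb.shape[0]
    K = speech_cb.shape[0] * n_noise
    log_A = np.log(composite_transitions(A_speech, A_noise))
    log_B = log_scores(obs, composite_spectra(speech_cb, noise_cb))
    log_pi = np.full(K, -np.log(K))              # uniform initial distribution
    return np.divmod(viterbi(log_pi, log_A, log_B), n_noise)

# Toy data (made up): 2 speech entries, 2 noise entries, 3 frequency bins.
speech_cb = np.array([[1.0, 4.0, 1.0], [4.0, 1.0, 1.0]])
noise_cb = np.array([[0.5, 0.5, 0.5], [2.0, 2.0, 2.0]])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
obs = np.vstack([speech_cb[0] + noise_cb[0]] * 3 + [speech_cb[1] + noise_cb[0]] * 3)
speech_idx, noise_idx = decode_states(obs, speech_cb, noise_cb, A, A)
```

In this toy setup the decoded speech indices select which speech codebook entry (and hence which set of LP parameters) to use in each frame. A real system would replace the distance-based score with a proper likelihood and learn the transition matrices from training data.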
Similar resources
A Formant Tracking LP Model for Speech Processing in Car/Train Noise
Formant estimation becomes complicated in the presence of correlated background noise such as car and train noise, because the spectrum of noise from revolving mechanical sources has its own spectral peaks that affect the number and positions of the observed peaks in the noisy speech spectrum. This paper investigates the modeling and estimation of spectral parameters at formants of noisy speech in the...
A formant tracking LP model for speech processing
This paper investigates the modeling and estimation of spectral parameters at formants of noisy speech in the presence of car and train noise. Formant estimation using two-dimensional hidden Markov models (2D-HMM) is reviewed and employed to study the influence of noise on observations of formants. The first set of experimental results presented shows the influence of car and train noise on the d...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Formant-tracking Linear Prediction Models for Speech Processing in Noisy Environments
This paper presents a formant-tracking method for estimation of the time-varying trajectories of a linear prediction (LP) model of speech in noise. The main focus of this work is on the modelling of the non-stationary temporal trajectories of the formants of speech for improved LP model estimation in noise. The proposed approach provides a systematic framework for modelling the inter-frame corr...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...