Speech model compensation with direct adaptation of cepstral variance to noisy environment
Authors
Abstract
A modified parallel model combination (PMC) method for noisy speech recognition is proposed in which both the speech cepstral mean and variance are adapted without mapping the variance between the cepstral and log-spectral domains. By examining a scalar log-energy random variable adapted in the manner of PMC, we observe that its adapted variance can be roughly predicted from the energy ratio of the source signals. Based on this observation, we propose approximating the cepstral variance of the adapted model according to the local signal-to-noise ratio (SNR) of each state: the combined cepstral variance is set to the variance of clean speech, the variance of noise, or the average of the two. The performance of this approximation is compared with that of the original PMC. Our experiment shows that the performance degradation is small, while the computational cost is greatly reduced compared with the PMC method.
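The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 10 dB margin, and the assumption that the local SNR arrives as a single value in dB are all hypothetical. The abstract gives neither the thresholds nor the way the local SNR is computed. The first function estimates, by Monte Carlo, the variance of a combined log-energy `log(exp(x) + exp(n))` for independent Gaussian log-energies. The second applies a three-way variance selection of the kind the abstract describes.

```python
import math
import random
import statistics


def combined_log_energy_variance(mu_x, var_x, mu_n, var_n, samples=20000, seed=0):
    """Monte Carlo estimate of Var[log(exp(x) + exp(n))].

    x and n are independent Gaussian log-energies of speech and noise.
    This only illustrates the scalar observation; it is not PMC itself.
    """
    rng = random.Random(seed)
    sd_x, sd_n = math.sqrt(var_x), math.sqrt(var_n)
    combined = [
        math.log(math.exp(rng.gauss(mu_x, sd_x)) + math.exp(rng.gauss(mu_n, sd_n)))
        for _ in range(samples)
    ]
    return statistics.pvariance(combined)


def select_cepstral_variance(var_speech, var_noise, local_snr_db, margin_db=10.0):
    """Choose the combined cepstral variance of one state from its local SNR.

    margin_db is a hypothetical threshold; the abstract does not give one.
    """
    if local_snr_db >= margin_db:       # speech dominates: keep clean-speech variance
        return list(var_speech)
    if local_snr_db <= -margin_db:      # noise dominates: use noise variance
        return list(var_noise)
    # Comparable energies: average the two variances per cepstral dimension.
    return [0.5 * (s + n) for s, n in zip(var_speech, var_noise)]


if __name__ == "__main__":
    # Speech energy far above noise, then far below it.
    print(combined_log_energy_variance(10.0, 1.0, 0.0, 4.0))
    print(combined_log_energy_variance(0.0, 1.0, 10.0, 4.0))
    print(select_cepstral_variance([1.0, 2.0], [3.0, 4.0], local_snr_db=0.0))
```

When one source dominates, the estimated variance stays close to that source's variance. This is the behaviour that makes a direct, SNR-based choice of variance plausible as a cheap substitute for mapping variances between domains.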
Similar resources
Linear interpolation of cepstral variance for noisy speech recognition
Combining the speech model with the background noise has been shown to be effective in improving the pattern classification rate of noisy speech. The model combination can be performed by adding spectral statistics such as the means and the variances. Since the speech feature for pattern classification has to be expressed in the cepstral domain, the combined spectral statistics have to be tr...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has in recent years become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Feature compensation in the cepstral domain employing model combination
In this paper, we present an effective cepstral feature compensation scheme which leverages knowledge of the speech model in order to achieve robust speech recognition. In the proposed scheme, the requirement for a prior noisy speech database in off-line training is eliminated by employing parallel model combination for the noise-corrupted speech model. Gaussian mixture models of clean speech a...
Speech recognition in noisy environments using first-order vector Taylor series
In this paper, we generalize relations between clean and noisy speech signal using vector Taylor series (VTS) expansion for noise-robust speech recognition. We use it for both the noisy data compensation and hidden Markov model (HMM) parameter adaptation, and apply it for the cepstral domain directly, while Moreno used it to estimate the log-spectral parameters. Also, we develop a detailed ...
Speech feature compensation based on pseudo stereo codebooks for robust speech recognition in additive noise environments
In this paper, we propose several compensation approaches to alleviate the effect of additive noise on speech features for speech recognition. These approaches are simple yet efficient noise reduction techniques that use online-constructed pseudo stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transforms for noise-corrupted speech features to ...