A generalized Stein's estimation approach for speech enhancement based on perceptual criteria
Abstract
We address the problem of speech enhancement using a risk-estimation approach. In particular, we propose the use of Stein’s unbiased risk estimator (SURE) to solve the problem. The need for a suitable finite-sample risk estimator arises because the actual risks invariably depend on the unknown ground truth. We first consider the popular mean-squared error (MSE) criterion and then compare it against the perceptually motivated Itakura-Saito (IS) distortion by deriving unbiased estimators of the corresponding risks. We use a generalized SURE (GSURE) development, recently proposed by Eldar for MSE. We consider dependent observation models from the exponential family with an additive noise model, and derive an unbiased estimator for the risk corresponding to the IS distortion, which is non-quadratic. This serves to address the speech enhancement problem in a more general setting. Experimental results illustrate that the IS metric is effective at suppressing musical noise, which affects the MSE-enhanced speech. However, in terms of global signal-to-noise ratio (SNR), the minimum-MSE solution gives better results.
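To make the risk-estimation idea concrete, the following is a minimal sketch of classical SURE for the MSE criterion only. It is not the GSURE or IS-risk estimator derived in the paper. It assumes i.i.d. Gaussian noise of known standard deviation `sigma`, a single scalar shrinkage gain `a` applied to every coefficient, and a synthetic Laplacian signal standing in for clean speech coefficients; all function and variable names are illustrative. The Itakura-Saito function is the standard textbook definition of the distortion, included only to show why it is non-quadratic.

```python
import numpy as np

def sure_shrinkage(y, a, sigma):
    """Classical SURE for f(y) = a * y, assuming y = x + N(0, sigma^2 I).

    SURE = (1/N)||f(y) - y||^2 - sigma^2 + (2 sigma^2 / N) * div f(y),
    and div f(y) = N * a for this linear estimator.
    """
    residual = np.mean((a * y - y) ** 2)
    return residual - sigma**2 + 2.0 * sigma**2 * a

def true_mse(x, y, a):
    """Oracle MSE; needs the clean signal x, which is unknown in practice."""
    return np.mean((a * y - x) ** 2)

def itakura_saito(p, p_hat):
    """Standard IS distortion between positive spectra p and p_hat."""
    r = p / p_hat
    return np.mean(r - np.log(r) - 1.0)

# Synthetic data (an assumption for illustration, not speech).
rng = np.random.default_rng(0)
n, sigma = 100_000, 0.5
x = rng.laplace(scale=1.0, size=n)
y = x + sigma * rng.normal(size=n)

# Sweep the gain and compare the estimated risk with the oracle risk.
gains = np.linspace(0.0, 1.0, 101)
sure_vals = np.array([sure_shrinkage(y, a, sigma) for a in gains])
mse_vals = np.array([true_mse(x, y, a) for a in gains])

best_sure = gains[np.argmin(sure_vals)]
best_mse = gains[np.argmin(mse_vals)]
print("gain chosen by SURE:", best_sure)
print("gain chosen by oracle MSE:", best_mse)
```

In this sketch the SURE curve is computed from the noisy observations alone, yet it tracks the oracle MSE curve closely, which is the property that lets a risk estimator replace the unknown ground truth when tuning a denoiser.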
Similar resources
Weighted Log-spectral Amplitude Estimation with Generalized Gamma Distribution under Speech Presence Probability
In this paper, we propose a speech enhancement approach. The approach is based on deriving a weighted log-spectral amplitude estimator that exploits generalized Gamma distributed speech priors under speech presence probability. The log-spectral amplitude estimator is weighted by a psychoacoustically motivated speech distortion measure to take advantage of the perceptual interpretation. The expe...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
Generalized multi-microphone spectral amplitude estimation based on correlated noise model
Enhancing speech contaminated by uncorrelated additive noise, when the degraded speech alone is available, has received much attention. In recent years many systems have used multi-microphone arrays for the task of speech enhancement and robust speech recognition. In this paper we introduce a generalized multi-microphone spectral amplitude estimation approach based on a model with non-negligibl...
Perceptual Wavelet Adaptive D
This paper introduces a novel speech enhancement system based on a wavelet denoising framework. In this system, the noisy speech is first preprocessed using a generalized spectral subtraction method to initially lower the noise level with negligible speech distortion. A perceptual wavelet transform is then used to decompose the resulting speech signal into critical bands. Threshold estimation i...
Speech spectral modeling and enhancement based on autoregressive conditional heteroscedasticity models
In this paper, we develop and evaluate speech enhancement algorithms, which are based on supergaussian generalized autoregressive conditional heteroscedasticity (GARCH) models in the short-time Fourier transform (STFT) domain. We consider three different statistical models, two fidelity criteria, and two approaches for the estimation of the variances of the STFT coefficients. The statistical mo...