Voice Activity Detection in MTF-Based Power Envelope Restoration
Abstract
This paper reports comparative evaluations of conventional voice activity detection (VAD) methods in reverberant environments. Both conventional and standard (G.729) methods are discussed. In general, these methods work well under clean conditions, but their performance is drastically affected by reverberation. Preliminary comparative evaluations showed that the false acceptance rate (FAR) is significantly increased due to the false rejection rate (FRR) being moderately increased by reverberation. We therefore developed a method that uses power envelope restoration based on the modulation transfer function (MTF) to improve the robustness of VAD in reverberant environments. This restoration method can blindly restore the power envelope of reverberant speech based on the MTF concept. The proposed method consists of an MTF-based restoration method as the front end and a conventional VAD method for the final decision. Experimental results demonstrated that the proposed method is superior to conventional methods with regard to robustness and accuracy of VAD (reducing both FAR and FRR) in reverberant environments.
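The front-end-plus-VAD structure described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes the reverberation time is already known (the method in the paper estimates it blindly), that the room's power envelope decays exponentially by 60 dB over that time, that the "conventional VAD" is a plain energy threshold, and that FAR and FRR are frame-level rates. All function names and parameter values are hypothetical.

```python
import math

def decay_per_frame(t60_s, frame_rate_hz):
    # Assumption: power drops by 60 dB (a factor of 1e-6) over t60_s seconds.
    return math.exp(-math.log(1e6) / (t60_s * frame_rate_hz))

def reverberate_envelope(env, decay):
    # Convolve the power envelope with decay**n, written as a recursion.
    out, prev = [], 0.0
    for e in env:
        prev = e + decay * prev
        out.append(prev)
    return out

def restore_envelope(rev_env, decay):
    # First-order inverse filter x[n] = y[n] - decay * y[n-1], clipped at zero.
    out, prev = [], 0.0
    for y in rev_env:
        out.append(max(y - decay * prev, 0.0))
        prev = y
    return out

def energy_vad(env, threshold):
    # Stand-in for a conventional VAD: a frame is speech if its power exceeds the threshold.
    return [e > threshold for e in env]

def far_frr(decisions, labels):
    # FAR: share of non-speech frames accepted as speech.
    # FRR: share of speech frames rejected as non-speech.
    on_non_speech = [d for d, is_speech in zip(decisions, labels) if not is_speech]
    on_speech = [d for d, is_speech in zip(decisions, labels) if is_speech]
    far = sum(on_non_speech) / len(on_non_speech)
    frr = sum(not d for d in on_speech) / len(on_speech)
    return far, frr

def demo():
    # Toy signal: 20 silent frames, 30 speech frames, 50 silent frames.
    labels = [False] * 20 + [True] * 30 + [False] * 50
    clean = [1.0 if is_speech else 0.0 for is_speech in labels]
    decay = decay_per_frame(t60_s=1.0, frame_rate_hz=100.0)
    reverberant = reverberate_envelope(clean, decay)
    restored = restore_envelope(reverberant, decay)
    return (far_frr(energy_vad(reverberant, 0.5), labels),
            far_frr(energy_vad(restored, 0.5), labels))

if __name__ == "__main__":
    (far_rev, frr_rev), (far_res, frr_res) = demo()
    print(f"reverberant: FAR={far_rev:.3f} FRR={frr_rev:.3f}")
    print(f"restored:    FAR={far_res:.3f} FRR={frr_res:.3f}")
```

In this toy case the reverberant tail after the speech segment stays above the threshold and produces false acceptances, and inverse filtering the envelope removes them. It only shows the mechanism; real speech, noise, and an estimated reverberation time would behave less cleanly.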
Similar resources
Refinement of an MTF-based speech dereverberation method using an optimal inverse-MTF filter
We previously proposed a speech dereverberation method based on the modulation transfer function (MTF). This method consists of power envelope restoration and carrier regeneration processes, and reduces both the loss due to degraded power envelopes and the loss of speech intelligibility. In the power envelope restoration, however, whether adaptive time-frequency division provides the best repre...
An MTF-based blind restoration of temporal power envelopes as a front-end processor for automatic speech recognition systems in reverberant environments
To reduce speech degradation in reverberant environments, we previously proposed a modulation transfer function (MTF) based method of speech restoration. The room impulse response (RIR) in this restoration does not need to be measured at any time since we modeled the power envelope of the RIRs as an exponential decay function. Speech is assumed to be temporally modulated with white noise carrier ...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Experimental comparison of turbulence modulation transfer function and aerosol modulation transfer function through the open atmosphere
Although turbulence is usually considered to be the primary cause of image blur, simultaneous and independent measurements of overall atmospheric modulation transfer function (MTF) and turbulence MTF over fairly long horizontal paths at 15-m average elevation indicate that, even at midday, aerosol MTF deriving from forward scatter is usually more dominant than turbulence MTF. Three different ex...
The RS Images Restoration of CBERS-2 Based on Atmospheric MTF Evaluation Using Meteorological Data
Atmospheric MTF (Modulation Transfer Function) plays a critical role in the restoration of atmospherically blurred RS (Remote Sensing) images. This MTF can be determined from meteorological data. Among the existing deblurring methods for RS images, meteorological data are not used in the estimation of the atmospheric MTF. For CBERS-02 (China and Brazil Earth Resource Satellite) RS image data, a new deblurring alg...