Session 2pSP: Acoustic Signal Processing for Various Applications
2pSP2. Towards blind reverberation time estimation for non-speech signals
Authors
Abstract
Reverberation time (RT) is an important parameter for room acoustics characterization, for intelligibility and quality assessment of reverberant speech, and for dereverberation. Commonly, RT is estimated from the room impulse response (RIR). In practice, however, RIRs are often unavailable or continuously changing. As such, blind estimation of RT based only on the recorded reverberant signals is of great interest. To date, blind RT estimation has focused on reverberant speech signals. Here, we propose to blindly estimate RT from non-speech signals, such as solo instrument recordings and music ensembles. To estimate the RT of non-speech signals, we propose a blind estimator based on an auditory-inspired modulation spectrum signal representation, which measures the modulation frequency of temporal envelopes computed from a 23-channel gammatone filterbank. We show that the higher modulation frequency bands are more sensitive to reverberation than the modulation bands below 20 Hz. When tested on a database of non-speech sounds under 23 different reverberation conditions with reverberation time (T40) ranging from 0.18 to 15.62 s, a blind estimator based on the ratio of high-to-low modulation frequencies outperformed two state-of-the-art methods and achieved correlations with early decay time (EDT) as high as 0.92 for solo instruments and 0.87 for ensembles.
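The processing chain described in the abstract (gammatone filterbank, temporal envelopes, modulation spectrum, high-to-low modulation ratio) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the truncated FIR gammatone approximation, the 125 Hz to 0.4·fs frequency range, and the 0.5 Hz and 128 Hz modulation band edges are all assumptions. Only the 23 channels and the 20 Hz split come from the abstract.

```python
import numpy as np

def erb_center_freqs(f_lo, f_hi, n):
    """n centre frequencies equally spaced on the ERB-number scale."""
    e_lo, e_hi = (21.4 * np.log10(0.00437 * f + 1.0) for f in (f_lo, f_hi))
    return (10.0 ** (np.linspace(e_lo, e_hi, n) / 21.4) - 1.0) / 0.00437

def gammatone_ir(fc, fs, dur=0.05, order=4):
    """Truncated gammatone impulse response (FIR approximation, assumed)."""
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * (24.7 + 0.108 * fc)  # bandwidth parameter from the ERB at fc
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))  # unit-energy normalisation

def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with the FFT."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    weights[1:(n + 1) // 2] = 2.0  # keep and double positive frequencies
    if n % 2 == 0:
        weights[n // 2] = 1.0      # Nyquist bin for even lengths
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))

def modulation_energy_ratio(x, fs, n_channels=23, split_hz=20.0,
                            mod_lo_hz=0.5, mod_hi_hz=128.0):
    """Ratio of envelope modulation energy above vs. below split_hz.

    Band edges mod_lo_hz and mod_hi_hz are assumed values. The signal
    should be long enough (a second or more) for the low band to
    contain at least one FFT bin.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    mod_freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    low = (mod_freqs >= mod_lo_hz) & (mod_freqs < split_hz)
    high = (mod_freqs >= split_hz) & (mod_freqs <= mod_hi_hz)
    e_low = e_high = 0.0
    for fc in erb_center_freqs(125.0, 0.4 * fs, n_channels):
        ir = gammatone_ir(fc, fs)
        n_fft = n + len(ir) - 1
        # FFT-based linear convolution, truncated to the input length
        band = np.fft.irfft(np.fft.rfft(x, n_fft) * np.fft.rfft(ir, n_fft), n_fft)[:n]
        env = hilbert_envelope(band)
        mod_power = np.abs(np.fft.rfft(env - env.mean())) ** 2
        e_low += mod_power[low].sum()
        e_high += mod_power[high].sum()
    return e_high / e_low
```

Calling `modulation_energy_ratio(x, fs)` on a mono recording returns a single score. Mapping that score to an RT value in seconds would need a calibration step that the abstract does not describe, so the sketch stops at the ratio itself.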
Similar resources
Performance Comparison of Algorithms for Blind Reverberation Time Estimation from Speech
The reverberation time, T60, is one of the key parameters used to quantify room acoustics. It can provide information about the quality and intelligibility of speech recorded in a reverberant environment, and it can be used to increase robustness to reverberation of speech processing algorithms. T60 can be determined directly from a measurement of the acoustic impulse response, but in situation...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Blind estimation of room acoustic parameters using kernel regression
Room acoustic parameters are key information for dereverberation or speech recognition. Usually, when one needs to assess the level of reverberation, only the reverberation time RT60 or a direct to reverberant sounds index Dτ is estimated. Yet, methods which blindly estimate the reverberation time from reverberant recorded speech do not always differentiate the RT60 from the Dτ to evaluate the ...
Blind Speech Dereverberation
Reverberation, a component of any sound generated in a natural environment, can degrade speech intelligibility or more generally the quality of a signal produced within a room. In a typical setup for teleconferencing, for instance, where the microphones receive both the speech and the reverberation of the surrounding space, it is of interest to have the latter removed from the signal that will ...
Microphone array power ratio for quality assessment of reverberated speech
Speech signals in enclosed environments are often distorted by reverberation and noise. In speech communication systems with several randomly distributed microphones, involving a dynamic speaker and unknown source location, it is of great interest to monitor the perceived quality at each microphone and select the signal with the best quality. Most existing approaches for quality estimation r...