Spectro-temporal activity pattern (STAP) features for noise robust ASR

Authors

Shajith Ikbal

Mathew Magimai-Doss

Hemant Misra

Hervé Bourlard

Abstract

In this paper, we introduce a new noise-robust representation of the speech signal, obtained by locating points of potential importance in the spectrogram and parameterizing the activity of the time-frequency pattern around those points. These features are referred to as Spectro-Temporal Activity Pattern (STAP) features. The suitability of these features for noise-robust speech recognition is examined for a particular parameterization scheme in which spectral peaks are chosen as the points of potential importance. The activity in the time-frequency patterns around these points is parameterized by measuring the dynamics of the patterns along both the time and frequency axes. As spectral peaks are considered to constitute an important and robust cue for speech recognition, this representation is expected to yield robust performance. An interesting result of the study is that, despite using relatively little information from the speech signal, STAP features achieve reasonable recognition performance in clean speech when compared to state-of-the-art features. In addition, STAP features produce significantly better performance in high-noise conditions. An entropy-based combination technique in a tandem framework, used to combine STAP features with standard features, yields a system that is more robust in all conditions.
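The abstract gives only the outline of the feature extraction: pick spectral peaks, then measure how the spectrogram changes around each peak along time and frequency. The sketch below is one possible reading of that outline, not the authors' implementation. The function names, the number of peaks per frame, the frame and hop sizes, and the two dynamics measures (a central difference along time and a second difference along frequency) are all assumptions made for illustration.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram, shape (n_frames, n_bins). Sizes are assumed."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)

def spectral_peaks(frame, n_peaks=4):
    """Bin indices of the n_peaks strongest local maxima along frequency."""
    interior = np.arange(1, len(frame) - 1)
    is_peak = (frame[interior] > frame[interior - 1]) & \
              (frame[interior] >= frame[interior + 1])
    candidates = interior[is_peak]
    strongest = candidates[np.argsort(frame[candidates])[::-1][:n_peaks]]
    return np.sort(strongest)

def stap_like_features(spec, n_peaks=4):
    """Per frame and peak: (peak bin, time delta, frequency curvature).

    Hypothetical parameterization: the abstract does not specify the
    exact dynamics measures, so simple finite differences are used.
    """
    n_frames, _ = spec.shape
    feats = np.zeros((n_frames, n_peaks, 3))
    for t in range(1, n_frames - 1):  # edge frames have no time neighbors
        for k, f in enumerate(spectral_peaks(spec[t], n_peaks)):
            d_time = (spec[t + 1, f] - spec[t - 1, f]) / 2.0
            curv_freq = spec[t, f - 1] - 2.0 * spec[t, f] + spec[t, f + 1]
            feats[t, k] = (f, d_time, curv_freq)
    return feats
```

For a stationary tone, a sketch like this should report a peak at the tone's frequency bin with a time delta near zero and a negative frequency curvature. Frames with fewer than `n_peaks` local maxima keep zeros in the unused slots, which is a simplification rather than a recommended design.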

Similar resources

Nonlinear Feature Transformations for Noise Robust Speech Recognition

Robustness against external noise is an important requirement for automatic speech recognition (ASR) systems, when it comes to deploying them for practical applications. This thesis proposes and evaluates new feature-based approaches for improving the ASR noise robustness. These approaches are based on nonlinear transformations that, when applied to the spectrum or feature, aim to emphasize the...

Methods for capturing spectro-temporal modulations in automatic speech recognition

Psychoacoustical and neurophysiological results indicate that spectro-temporal modulations play an important role in sound perception. Speech signals, in particular, exhibit distinct spectro-temporal patterns which are well matched by receptive fields of cortical neurons. In order to improve the performance of automatic speech recognition (ASR) systems a number of different approaches are prese...

Using spectro-temporal features to improve AFE feature extraction for ASR

Previous work has shown that spectro-temporal features reduce WER for automatic speech recognition under noisy conditions. The spectro-temporal framework, however, is not the only way to process features in order to reduce errors due to noise in the signal. The two-stage mel-warped Wiener filtering method used in the “Advanced Front End” (AFE), now a standard front end for robust recognition, i...

Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition.

In an attempt to increase the robustness of automatic speech recognition (ASR) systems, a feature extraction scheme is proposed that takes spectro-temporal modulation frequencies (MF) into account. This physiologically inspired approach uses a two-dimensional filter bank based on Gabor filters, which limits the redundant information between feature components, and also results in physically int...

Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition.

To test if simultaneous spectral and temporal processing is required to extract robust features for automatic speech recognition (ASR), the robust spectro-temporal two-dimensional-Gabor filter bank (GBFB) front-end from Schädler, Meyer, and Kollmeier [J. Acoust. Soc. Am. 131, 4134-4151 (2012)] was de-composed into a spectral one-dimensional-Gabor filter bank and a temporal one-dimensional-Gabor...

Publication date: 2004
