Damped oscillator cepstral coefficients for robust speech recognition

Authors

Vikramjit Mitra

Horacio Franco

Martin Graciarena

Abstract

This paper presents a new signal-processing technique motivated by the physiology of the human auditory system. In this approach, auditory hair cells are modeled as damped oscillators stimulated by bandlimited time-domain speech signals that act as forcing functions. Oscillation synchrony is induced by time-aligning and three-way coupling the forcing functions across the individual bands, so that a given oscillator is driven not only by its own critical band's forcing function but also by those of its two neighboring bands. We present two separate features: the Damped Oscillator Cepstral Coefficient (DOCC), which uses the damped oscillator response to the forcing functions without synchrony, and the Synchronized Damped Oscillator Cepstral Coefficient (SyDOCC), which uses the damped oscillator response to time-synchronized forcing functions. The proposed features are used in an Aurora4 noise- and channel-degraded speech recognition task, and the results indicate that they improved speech-recognition performance in all conditions compared to the baseline mel-cepstral feature and other published noise-robust features.
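The abstract does not give the oscillator equation, so the following is only a minimal sketch of the general idea of a damped oscillator driven by a band signal as its forcing function. It assumes the textbook form x'' + 2ζω₀x' + ω₀²x = f(t); the function name `damped_oscillator_response`, the damping ratio `zeta`, and the semi-implicit Euler integration are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def damped_oscillator_response(force, fs, f0, zeta=0.1):
    """Displacement of a forced damped oscillator (hypothetical helper).

    Integrates x'' + 2*zeta*w0*x' + w0**2 * x = force(t) with a
    semi-implicit Euler step. This generic form is an assumption;
    the paper's actual oscillator model is not stated in the abstract.
    """
    w0 = 2.0 * np.pi * f0      # natural angular frequency (rad/s)
    dt = 1.0 / fs              # sampling interval (s)
    x, v = 0.0, 0.0            # displacement and velocity, starting at rest
    out = np.empty(len(force))
    for n, f in enumerate(force):
        a = f - 2.0 * zeta * w0 * v - w0 ** 2 * x  # acceleration
        v += a * dt            # update velocity first ...
        x += v * dt            # ... then displacement with the new velocity
        out[n] = x
    return out

# Drive a 500 Hz oscillator with a tone at its own frequency and with
# a tone far from it, then compare the mean response energy.
fs = 16000
t = np.arange(0, 0.2, 1.0 / fs)
on_band = damped_oscillator_response(np.sin(2 * np.pi * 500 * t), fs, f0=500)
off_band = damped_oscillator_response(np.sin(2 * np.pi * 2000 * t), fs, f0=500)
energy_on = float(np.mean(on_band ** 2))
energy_off = float(np.mean(off_band ** 2))
```

In this sketch the oscillator responds far more strongly to a forcing function near its natural frequency than to one well outside it, which is the band-selective behavior the hair-cell analogy relies on. Turning such responses into cepstral features, and coupling each oscillator to its neighboring bands, would require details that the abstract does not provide.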

Related articles

Improving the performance of MFCC for Persian robust speech recognition

The mel-frequency cepstral coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has in recent years become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Regularized minimum variance distortionless response-based cepstral features for robust continuous speech recognition

In this paper, we present robust feature extractors that incorporate a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator, used in many front-ends including the conventional MFCC, to estimate the speech power spectrum. Direct spectrum estimators, e.g., single tapered periodogram, have high var...

Feature extraction from higher-lag autocorrelation coefficients for robust speech recognition

In this paper, a feature extraction method that is robust to additive background noise is proposed for automatic speech recognition. Since the background noise corrupts the autocorrelation coefficients of the speech signal mostly at the lower time lags, while the higher-lag autocorrelation coefficients are least affected, this method discards the lower-lag autocorrelation coefficients and uses o...

Relative mel-frequency cepstral coefficients compensation for robust telephone speech recognition

Finding robust and computationally simple methods is crucial for the practical application of telephone speech recognition. In this paper, we propose a new channel compensation method, which uses a RASTA-like band-pass filter on the mel-frequency cepstral coefficients for robust telephone speech recognition. It is shown from the experiments that the proposed method, comparing with the ...

Publication date: 2013
