EARLYZER: perceptually motivated robust TFR of speech

Authors

  • J. V. Avadhanulu
  • M. Mathew
  • Thippur V. Sreenivas
Abstract

Developing a robust and efficient front end is crucial for robust automatic speech recognition (ASR). Proper time and frequency resolution of the time-frequency representation (TFR) of speech, motivated by auditory models, is considered an important factor for robustness. An efficient method of realizing a variable-resolution TFR using the DTFT/Goertzel algorithm is proposed in place of the standard FFT-based approach. The new representation, called EarLyzer, is shown to be more robust than the FFT-based Mel-frequency cepstral coefficient (MFCC) representation on a recognition task with speech corrupted by automobile noise.
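The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It uses the standard Goertzel recursion to evaluate the DTFT power at one arbitrary frequency, and then applies it with a different analysis length per channel to obtain one column of a variable-resolution TFR. The function names, the channel list of (frequency, window length) pairs, and the rectangular window are all assumptions made for illustration.

```python
import math


def goertzel_power(frame, freq_hz, sample_rate):
    """Squared magnitude of the DTFT of `frame` at `freq_hz` (Goertzel recursion)."""
    omega = 2.0 * math.pi * freq_hz / sample_rate
    coeff = 2.0 * math.cos(omega)
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    for x in frame:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    # |s1 - exp(-j*omega) * s2|^2, which equals |X(omega)|^2
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def variable_resolution_column(signal, center, channels, sample_rate):
    """One TFR column around sample index `center`.

    `channels` is a hypothetical list of (freq_hz, window_len) pairs, so each
    channel can trade time resolution against frequency resolution.
    A rectangular window is assumed for simplicity.
    """
    column = []
    for freq_hz, window_len in channels:
        start = max(0, center - window_len // 2)
        segment = signal[start:start + window_len]
        column.append(goertzel_power(segment, freq_hz, sample_rate))
    return column


if __name__ == "__main__":
    fs = 8000
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(160)]
    # Longer window at the lower frequency, shorter window at the higher one.
    print(variable_resolution_column(tone, 80, [(1000.0, 80), (2000.0, 40)], fs))
```

Because each channel is computed independently, the cost grows with the number of channels and their window lengths rather than with a single FFT size, which is why a Goertzel-style evaluation suits a small set of non-uniformly spaced analysis frequencies.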


Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown good performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...


Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...


A novel method of analysing and compa algorithms using auditory time-fre

A new and potentially important method for predicting, analysing and comparing responses of hearing aid algorithms is studied and presented here. This method is based on a time-frequency representation (TFR) generated by a computational auditory model. Hearing impairment is simulated by a change of parameters of the auditory model. To simulate the basilar membrane (BM) filtering part of the audi...


Fusion of Acoustic, Perceptual and Production Features for Robust Speech Recognition in Highly Non-stationary Noise

Improving the robustness of speech recognition systems to cope with adverse background noise is a challenging research topic. Extraction of noise robust acoustic features is one of the prominent methods used for incorporating robustness in speech recognition systems. Prior studies have proposed several perceptually motivated noise robust acoustic features, and the normalized modulation cepstral...


Signal Processing for Robust Speech Recognition

This chapter compares several different approaches to robust automatic speech recognition. We review ongoing research in the use of acoustical pre-processing to achieve robust speech recognition, discussing and comparing approaches based on direct cepstral comparisons, on parametric models of environmental degradation, and on cepstral high-pass filtering. We also describe and compare the effectiven...



Publication date: 1999