Improving the performance of MFCC for Persian robust speech recognition

Authors

  • D. Darabian, Department of Electrical Engineering, University of Shahrood, Shahrood, Iran.
  • H. Marvi, Department of Electrical Engineering, University of Shahrood, Shahrood, Iran.
Abstract:

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. To achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, this paper introduces a new noise-robust set of MFCC vectors, estimated through the following steps. First, spectral mean normalization is applied to the noisy original speech signal as a pre-processing step. The pre-emphasized speech is segmented into overlapping time frames, and each frame is windowed by a modified Hamming window. Higher-order autocorrelation coefficients are then extracted, and the lower-order autocorrelation coefficients are eliminated. The result is passed through an FFT block, and the power spectrum of the output is calculated. A Gaussian-shaped filter bank is applied to this power spectrum. A logarithm and two compensator blocks, one a mean subtraction and the other a root block, are then applied, and a DCT is the last step. An MLP neural network is used to classify the results and to evaluate the performance of the proposed MFCC method. Speech recognition experiments on various tasks indicate that the proposed algorithm is more robust than traditional ones in noisy conditions.

similar resources

Spectral Normalisation MFCC Derived Features for Robust Speech Recognition

This paper presents a method for extracting MFCC parameters from a normalised power spectrum density. The underlying spectral normalisation method is based on the fact that speech regions with less energy need more robustness: in these regions the noise is more dominant, so the speech is more corrupted. Lower-energy speech regions usually contain sounds of unvoiced nature where are i...

Robust speech/non-speech detection using LDA applied to MFCC for continuous speech recognition

Continuous speech recognition applications need precise detection because the number of words to recognize is unknown and vocabulary words can be short. The speech/non-speech detection must be robust to the boundary precision. In this work, a new approach to evaluate detection algorithm for continuous speech recognition is presented. The speech/non-speech detection using energy parameter combin...

Modification of nanoclay for improving the physico-mechanical properties of dental adhesives

The main aim of this study is to prepare a novel dental dentin adhesive system based on nanoclay grafted with poly(methacrylic acid), nanoclay grafted with poly(acrylic acid), a mixture of nanosilica and nanoclay grafted with poly(methacrylic acid), a mixture of nanosilica and nanoclay grafted with poly(acrylic acid), and nanoclay grafted with chitosan modified with glycidyl methacrylate. The grafting of poly(methacrylic acid) and poly(acrylic acid) onto the nanoclay surface in the presence and ...

A comparative pragmatic analysis of the speech act of “disagreement” across English and Persian

The speech act of disagreement has been one of the speech acts that has received the least attention in the field of pragmatics. This study investigates the ways power relations, social distance, formality of the context, gender, and language proficiency (for EFL learners) influence disagreement and politeness strategies. The participants of the study were 200 male and female native Persian s...

Assessment of the Park-Ang damage index for performance levels of RC moment-resisting frames

Abstract: The main goal of seismic design is to ensure life safety during an earthquake and the repairability of the damaged structure afterwards. Experience from recent earthquakes has shown that buildings designed with force-based codes are not sufficiently accurate in limiting the damage to the structure. This has led to the emergence of a new generation of performance-based codes. In these codes, based on inelastic deformations ...

Journal title

volume 3, issue 2

pages 149-156

publication date 2015-10-01
