Method for ASR Performance Prediction Based on Temporal Properties of Speech Signal

Authors

Hynek Hermansky

Vijayaditya Peddinti

Abstract

Extending previous work on predicting phoneme recognition error from unlabelled data corrupted by unpredictable factors, this work investigates a simple but effective method of estimating ASR performance by computing the Mean Temporal Distance (MTD): the mean distance between speech feature vectors, determined as a function of the temporal distance between the vectors. MTD is shown to be a function of the signal-to-noise ratio of the speech signal. Comparing MTD curves derived on the data used to train the classifier with those derived on test utterances allows the error on the test data to be predicted. Another interesting observation is that the MTD remains approximately constant once the temporal separation exceeds a certain critical interval (about 200 ms), which corresponds to the extent of coarticulation in speech sounds. This lends further support to the notion that the speech message is coded in overlapping speech sound units lasting approximately 200 ms.
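The MTD described in the abstract can be illustrated with a minimal sketch. Several details here are assumptions rather than the authors' stated method: the abstract does not name the distance measure (Euclidean distance is used below), the function names `mean_temporal_distance` and `curve_gap` are hypothetical, and the mean absolute difference between curves is only one possible way to compare a training curve with a test curve.

```python
import numpy as np


def mean_temporal_distance(features, max_lag):
    """Return an MTD curve: entry k-1 is the mean distance at a lag of k frames.

    features: array of shape (num_frames, feature_dim).
    Euclidean distance is an assumption; the abstract does not specify the metric.
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    if not 1 <= max_lag < num_frames:
        raise ValueError("max_lag must be between 1 and num_frames - 1")
    curve = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        # Pair every frame with the frame `lag` steps later.
        diffs = features[lag:] - features[:-lag]
        curve[lag - 1] = np.linalg.norm(diffs, axis=1).mean()
    return curve


def curve_gap(train_curve, test_curve):
    """Hypothetical summary statistic: mean absolute difference of two MTD curves."""
    train_curve = np.asarray(train_curve, dtype=float)
    test_curve = np.asarray(test_curve, dtype=float)
    return float(np.mean(np.abs(train_curve - test_curve)))
```

If the features were computed with a 10 ms frame shift (an assumption, since the abstract does not give one), a lag of 20 frames would correspond to the roughly 200 ms separation beyond which the abstract reports the curve flattening.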

Similar resources

Improving the Performance of a Continuous Speech Recognition System Using Features Extracted from Speech Manifolds in the Reconstructed Phase Space

The design of new methods for extracting features from the speech signal, and the combination of the information they provide, is one of the most effective approaches to improving the performance of automatic speech recognition (ASR) systems. Recent research has shown that the speech signal contains nonlinear and chaotic properties, but the effects of these properties are not used in the continuous ...

Automatic speech recognition with primarily temporal envelope information

The aim of this study is to devise a computational method to predict cochlear implant (CI) speech recognition. Here, we describe a high-throughput screening system for optimizing CI speech processing strategies using hidden Markov model (HMM)-based automatic speech recognition (ASR). Word accuracy was computed on vocoded CI speech synthesized from primarily multi-channel temporal envelope infor...

Hilbert Envelope Based Features for Far-Field Speech Recognition

Automatic speech recognition (ASR) systems, trained on speech signals from close-talking microphones, generally fail in recognizing far-field speech. In this paper, we present a Hilbert Envelope based feature extraction technique to alleviate the artifacts introduced by room reverberations. The proposed technique is based on modeling temporal envelopes of the speech signal in narrow sub-bands u...

Multi-step linear prediction based speech dereverberation in noisy reverberant environment

A speech signal captured by a distant microphone is generally contaminated by reverberation and background noise, which severely degrade the automatic speech recognition (ASR) performance. In this paper, we first extend a previously proposed single channel dereverberation algorithm to a multi-channel scenario. The method estimates late reflections using multichannel multi-step linear prediction...

Prediction of Epileptic Seizures in Patients with Temporal Lobe Epilepsy (TLE) based on Cepstrum analysis and AR model of EEG signal

Epilepsy is a chronic disorder of brain function caused by abnormal and excessive electrical discharge of neurons in the brain. Seizures cause disturbances in consciousness that occur without prior notice, so the ability to predict them from EEG data can reduce stress and improve quality of life. An epileptic patient's EEG data consists of five parts: Ictal, Inter-Ictal, pre-Ictal, Post-Ictal, an...

Publication date: 2012
