Speaker-independent speech recognition using micro segment spectrum integration

نویسنده

  • Kiyoaki Aikawa
چکیده

This paper proposes a new spectral estimation method for automatic speech recognition. The spectrum estimated with the conventional data window of around 30 ms shows harmonic structure in the voiced portions of speech data. The harmonic frequency interval is often comparable to the formant frequency interval for female voices with high F0, which results in spectral estimation error. The new idea is to estimate spectrum by taking the Lp norm of the time series of the spectrum obtained from a very short speech segment. The new method, called the micro-segment spectrum integration, provides (1) precise spectral estimation not a ected by harmonic structure, and (2) noise-robustness by suppressing noisy speech segments. Phoneme recognition experiments demonstrate that the micro-segment spectrum integration method outperforms conventional spectral estimation methods.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Noise-invariant representation for speech signals

A new group-delay based spectral domain is explored for representation of speech signals and for extraction of robust features. The spectrum is computed using the group-delay functions defined on the autocorrelation of a short segment of speech. The features derived from this spectrum are easy to compute and are robust to the background noise. The invariance of the spectral shape to noise in th...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998