Improving the Role of Unvoiced Spectral Normalisation in Robu

Authors

  • Carlos Lima
  • Luís B. Almeida
  • João L. Monteiro
Abstract

This paper presents a spectral-normalisation-based method for extracting robust speech features in additive noise. The method has two main goals. 1) The “peaked” spectral zones, where most of the speech energy is concentrated, must be preserved as far as possible by the feature extraction process when moving from clean to noisy speech features. These spectral regions are usually the most reliable because of their higher speech energy and the frequent assumption of independence between speech and noise. 2) The speech regions with less energy need more robustness, since noise is more dominant there and the speech is therefore more corrupted. These regions usually correspond to unvoiced speech, which includes nearly half of the consonants. The proposed normalisation is optimal if both the corrupted process and the noise process have white-noise characteristics, where optimal means that the corrupting noise does not change the means of the observed vectors of the corrupted process at all. For signal-to-noise ratios greater than 5 dB, the results show that, for stationary white noise, the proposed normalisation, which ignores the noise characteristics, outperforms conventional Markov model composition, which requires the noise to be known. Additionally, if the noise is known, a reasonable approximation of the inverted system can easily be obtained by performing noise compensation, which further increases the recogniser performance.
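The white-noise optimality claim can be illustrated with a small numerical sketch. The abstract does not give the normalisation formula, so the code below assumes a simple one: each power spectrum is divided by its total energy. The function names, the 8-bin spectra, and the use of expected (noise-free-of-randomness) power spectra are all hypothetical choices made for illustration, not the authors' method. Under the independence assumption, the expected noisy power spectrum is the sum of the speech and noise spectra.

```python
def normalise(psd):
    """Scale a power spectrum so its bins sum to 1 (an assumed normalisation)."""
    total = sum(psd)
    return [p / total for p in psd]


def white_noise_psd(speech_psd, snr_db):
    """Flat noise spectrum whose total power yields the requested SNR in dB."""
    noise_power = sum(speech_psd) / (10 ** (snr_db / 10))
    return [noise_power / len(speech_psd)] * len(speech_psd)


def add_noise(speech_psd, noise_psd):
    """Expected noisy spectrum for independent, additive speech and noise."""
    return [s + n for s, n in zip(speech_psd, noise_psd)]


def max_shift(speech_psd, snr_db):
    """Largest per-bin change in the normalised spectrum caused by the noise."""
    clean = normalise(speech_psd)
    noisy = normalise(add_noise(speech_psd, white_noise_psd(speech_psd, snr_db)))
    return max(abs(c - n) for c, n in zip(clean, noisy))


# Hypothetical 8-bin spectra: a peaked (voiced-like) frame and a flat,
# white-like (unvoiced-like) frame.
voiced = [1.0, 8.0, 40.0, 8.0, 2.0, 1.0, 0.5, 0.5]
unvoiced = [1.0] * 8

print("unvoiced shift at 5 dB:", max_shift(unvoiced, 5.0))
print("voiced shift at 5 dB:  ", max_shift(voiced, 5.0))
```

In this toy setting the flat frame is left essentially unchanged by white noise after normalisation, which matches the sense of "optimal" used in the abstract. The peaked frame does shift, but its dominant bin stays dominant, which is in the spirit of the first goal.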

Similar resources

Improving the role of unvoiced speech segments by spectral normalisation in robust speech recognition

This paper presents a spectral-normalisation-based method for extracting robust speech features in additive noise. The method has two main goals: 1) The “peaked” spectral zones, where most of the speech energy is concentrated, must be preserved (from clean to noisy speech features) as much as possible by the feature extraction process. Usually, these spectral regions are the most reliable due to...

Spectral Normalisation MFCC Derived Features for Robust Speech Recognition

This paper presents a method for extracting MFCC parameters from a normalised power spectral density. The underlying spectral normalisation method is based on the fact that the speech regions with less energy need more robustness, since noise is more dominant in these regions and the speech is therefore more corrupted. Low-energy speech regions usually contain sounds of unvoiced nature where are i...

Unvoiced speech segregation based on CASA and spectral subtraction

Unvoiced speech separation is an important and challenging problem that has not received much attention. We propose a CASA based approach to segregate unvoiced speech from nonspeech interference. As unvoiced speech does not contain periodic signals, we first remove the periodic portions of a mixture including voiced speech. With periodic components removed, the remaining interference becomes mo...

Spectral-spatial classification of hyperspectral images by combining hierarchical and marker-based Minimum Spanning Forest algorithms

Many studies have demonstrated that spatial information can play an important role in the classification of hyperspectral imagery. This study proposes a modified spectral–spatial classification approach for improving the classification of hyperspectral images. In the proposed method ten spatial/texture features, using mean, standard deviation, contrast, homogeneity, corr...

Perceptual experiments on enhanced and slowed down speech sentences for second language acquisition

This paper investigates the perception of speech signals that have been enhanced and slowed down selectively, with the view of improving oral comprehension for second language acquisition. Our modifications are applied on a small number of acoustic cues, i.e. bursts of unvoiced stops, unvoiced fricative noises and rapid spectral transition regions. Bursts and frication noises were amplified, an...

Publication date: 2002