Robust speech recognition features based on temporal trajectory filtering of frequency band spectrum
Abstract
This paper presents the use of a variety of filters on the temporal trajectories of the frequency band spectrum to extract speech recognition features that are robust to environmental noise. Three kinds of filters for emphasizing the statistically important parts of speech are proposed. First, a bank of RASTA-like band-pass filters is used to fit the statistical peaks of the modulation frequency band spectrum of speech. Second, a three-channel octave band filter bank with a smoothed rectangular window spline is applied. Third, a data-driven filter is developed. Experimental results show that the proposed feature extraction approach can achieve significant improvements in speech recognition under noisy environments.
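As a rough illustration of temporal trajectory filtering, the sketch below band-pass filters each frequency band's log-energy trajectory over time. The abstract does not give filter coefficients, so this uses the classic RASTA transfer function, H(z) = 0.1 (2 + z^-1 - z^-3 - 2z^-4) / (1 - 0.98 z^-1), as an assumption. It is not the paper's filter bank, octave band filter, or data-driven filter. The function name `rasta_like_filter` and the input layout (frames by bands) are hypothetical.

```python
import numpy as np

def rasta_like_filter(log_energies, pole=0.98):
    """Band-pass filter the temporal trajectory of every frequency band.

    log_energies: array of shape (num_frames, num_bands), one row per frame.
    pole: feedback coefficient (0.98 is the classic RASTA value; an
          assumption here, not a value taken from the abstract).
    """
    x = np.asarray(log_energies, dtype=float)
    # Numerator taps of the classic RASTA filter. They sum to zero, so the
    # filter rejects constant (DC) offsets in each band's trajectory.
    b = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])
    y = np.zeros_like(x)
    for t in range(x.shape[0]):
        acc = np.zeros(x.shape[1])
        # Feed-forward part: weighted sum of the current and past frames.
        for k, bk in enumerate(b):
            if t - k >= 0:
                acc += bk * x[t - k]
        # Feedback part: single real pole acting as a leaky integrator.
        if t >= 1:
            acc += pole * y[t - 1]
        y[t] = acc
    return y
```

Because the numerator taps sum to zero, a constant offset in a band (for example, a fixed channel gain in the log domain) decays out of the filtered trajectory, while slow-to-moderate modulations are retained.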
Similar resources
Robust Speech Recognition Features Based on Temporal Trajectory Filtering and Non-Uniform Spectral Compression
This paper proposes a new feature extraction method based on temporal trajectory filtering and non-uniform spectral compression and examines its performance on two tasks in noisy environments. Temporal trajectory filtering is effective for robust speech recognition in noisy environments, because human hearing is more sensitive to relative values than to absolute values and the effect of add...
Auditory Contrast Spectrum for Robust Speech Recognition
Traditional speech representations are based on the power spectrum, which is obtained by energy integration over many frequency bands. Such representations are sensitive to noise, since noise energy distributed across a wide frequency band may degrade them. Inspired by the contrast sensitive mechanism in auditory neural processing, in this paper, we propose an auditory contrast spec...
On factorizing spectral dynamics for robust speech recognition
In this paper, we introduce new dynamic speech features based on the modulation spectrum. These features, termed Mel-cepstrum Modulation Spectrum (MCMS), map the time trajectories of the spectral dynamics into a series of slow and fast moving orthogonal components, providing a more general and discriminative range of dynamic features than traditional delta and acceleration features. The feature...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interaction. The aim of an SER system is to recognize human emotion by analyzing the acoustics of speech. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Multi-resolution RASTA filtering for TANDEM-based ASR
A new speech representation based on multiple filtering of temporal trajectories of speech energies in frequency sub-bands is proposed and tested. The technique extends earlier work on delta features and RASTA filtering by processing temporal trajectories with a bank of band-pass filters of varying resolutions. In initial tests on the OGI Digits database the technique yields about 30% relative impro...