On the Potential for Robust ASR with Combined Subband-Waveform and Cepstral Features
Abstract
This work explores the potential for robust classification of phonemes in the presence of additive noise and linear filtering, using high-dimensional features in the subbands of acoustic waveforms. The proposed technique is compared with state-of-the-art automatic speech recognition (ASR) front-ends on the TIMIT phoneme classification task using support vector machines (SVMs). Two key issues are addressed: selecting appropriate SVM kernels for classification in frequency subbands, and combining the individual subband classifiers using ensemble methods. Experiments demonstrate the benefits of classification in the subbands of acoustic waveforms: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for all signal-to-noise ratios (SNRs) below a crossover point that lies between 12 dB and 6 dB. Combining the subband-waveform and cepstral classifiers yields further improvements over both individual classifiers.
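The noise conditions in the abstract are stated as SNRs in dB. As a minimal sketch of what such a condition means, and not the authors' experimental code, the snippet below scales a noise sequence so that the mixture reaches a requested SNR. The function names, the sine "signal", and the Gaussian noise are all assumptions chosen for illustration.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in x) / len(x)


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal` after scaling it to the requested SNR (in dB).

    SNR_dB = 10 * log10(P_signal / P_noise), so the noise power we need
    is P_signal / 10^(SNR_dB / 10).
    """
    target_noise_power = power(signal) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(signal, noise)]


def measured_snr_db(signal, noisy):
    """Recover the SNR of a mixture when the clean signal is known."""
    residual = [y - s for s, y in zip(signal, noisy)]
    return 10.0 * math.log10(power(signal) / power(residual))


# Hypothetical test data: a sine wave corrupted by Gaussian noise at 6 dB SNR.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 0.01 * t) for t in range(1000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
noisy = mix_at_snr(signal, noise, 6.0)
```

Lowering `snr_db` from 12 to 6 multiplies the noise power by roughly four, which is the range in which the abstract locates the crossover between the two front-ends.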
Related articles
Third-Order Moments of Filtered Speech Signals for Robust Speech Recognition
Novel speech features calculated from third-order statistics of subband-filtered speech signals are introduced and studied for robust speech recognition. These features have the potential to capture nonlinear information not represented by cepstral coefficients. Also, because the features presented in this paper are based on the third-order moments, they may be more immune to Gaussian noise tha...
A Subband-Based SVM Front-End for Robust ASR
This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) front-end that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed....
A High-Dimensional Subband Speech Representation and SVM Framework for Robust Speech Recognition
This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) frontend that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. ...
Spectral subband centroid features for speech recognition
Cepstral coefficients derived either through linear prediction (LP) analysis or from filter bank are perhaps the most commonly used features in currently available speech recognition systems. In this paper, we propose spectral subband centroids as new features and use them as supplement to cepstral features for speech recognition. We show that these features have properties similar to formant f...
A Robust SAR NLFM Waveform Selection Based on the Total Quality Assessment Techniques
Design, simulation and optimal selection of cosine-linear frequency modulation waveform (CNLFM) based on correlated ambiguity function (AF) method for the purpose of Synthetic Aperture Radar (SAR) is done in this article. The selected optimum CNLFM waveform in contribution with other waveforms are applied directly into a SAR image formation algorithm (IFA) and their quality effects performance ...