Entropy based combination of tandem representations for noise robust ASR
Abstract
In this paper, we present an entropy-based method to combine tandem representations of the recently proposed Phase AutoCorrelation (PAC) based features and Mel-Frequency Cepstral Coefficient (MFCC) features. PAC based features, which are derived from a nonlinear transformation of autocorrelation coefficients and have been shown to be noise robust, become more robust to additive noise in their tandem representation. MFCC features in their tandem representation, on the other hand, show a significant improvement in recognition performance on clean speech. The entropy-based combination method investigated in this paper adaptively gives a higher weight to the representation of MFCC features in clean speech and to the representation of PAC based features in noisy speech, thus yielding robust recognition performance in all conditions.
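The abstract does not state the exact weighting rule, so the following is only a minimal sketch of one plausible scheme, assuming inverse-entropy weighting of per-frame posterior vectors from two streams. The function names and the example posteriors are hypothetical, chosen purely for illustration, and are not taken from the paper.

```python
import math


def entropy(posteriors):
    """Shannon entropy (in nats) of one frame's posterior distribution."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def combine_inverse_entropy(post_a, post_b, eps=1e-10):
    """Combine two posterior vectors with weights proportional to 1/entropy.

    Assumption: the stream whose posteriors have lower entropy (a more
    confident classifier output) receives the larger weight.
    """
    inv_a = 1.0 / (entropy(post_a) + eps)
    inv_b = 1.0 / (entropy(post_b) + eps)
    w_a = inv_a / (inv_a + inv_b)
    w_b = 1.0 - w_a
    combined = [w_a * pa + w_b * pb for pa, pb in zip(post_a, post_b)]
    return combined, (w_a, w_b)


# Hypothetical frame: the MFCC stream is confident, the PAC stream is nearly flat.
mfcc_post = [0.90, 0.05, 0.05]
pac_post = [0.40, 0.30, 0.30]
combined, (w_mfcc, w_pac) = combine_inverse_entropy(mfcc_post, pac_post)
```

In this made-up frame the sharper MFCC posteriors receive the larger weight; in a frame where noise flattens the MFCC posteriors, the same rule would shift weight toward the PAC stream.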
Similar resources
New entropy based combination rules in HMM/ANN multi-stream ASR
Classifier performance is often enhanced by combining multiple streams of information. In the context of multi-stream HMM/ANN systems in ASR, a confidence measure widely used in classifier combination is the entropy of the posterior distribution output by each ANN, which generally increases as classification becomes less reliable. The rule most commonly used is to select the ANN with the...
Spectro-temporal activity pattern (STAP) features for noise robust ASR
In this paper, we introduce a new noise robust representation of the speech signal, obtained by locating points of potential importance in the spectrogram and parameterizing the activity of the time-frequency pattern around those points. These features are referred to as Spectro-Temporal Activity Pattern (STAP) features. The suitability of these features for noise robust speech recognition is examined ...
Spectral Entropy Feature in Multi-Stream for Robust ASR
In recent papers, entropy computed from sub-bands of the spectrum was used as a feature for automatic speech recognition. In the present paper, we further study the sub-band spectral entropy features, which capture the flatness/peakiness of the sub-band spectrum and, in turn, the position of the formants in the spectrum. The sub-band spectral entropy features are used in hybrid hidden Markov mode...
From Multi-Band Full Combination to Multi-Stream Full Combination Processing in Robust ASR
The multi-band processing paradigm for noise robust ASR was originally motivated by the observation that human recognition appears to be based on independent processing of separate frequency sub-bands, and also by “missing data” results which have shown that ASR can be made significantly more robust to band-limited noise if noisy sub-bands can be detected and then ignored. Of the different mult...
Invariant Representations for Noisy Speech Recognition
Modern automatic speech recognition (ASR) systems need to be robust under acoustic variability arising from environmental, speaker, channel, and recording conditions. Ensuring such robustness to variability is a challenge in modern day neural network-based ASR systems, especially when all types of variability are not seen during training. We attempt to address this problem by encouraging the ne...