Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation
Abstract
Feature compensation for noise-robust speech recognition becomes more effective if normalization of the time-derivative parameters is taken into account. This paper describes an implementation of Delta-Cepstrum Normalization (DCN) that runs with minimal response time. The proposed algorithm, referred to as Recursive DCN, provides word error rate improvements comparable to those of conventional DCN. Since DCN includes a procedure that adjusts the mismatch between the cepstrum part and the delta-cepstrum part, it works effectively even if only a small amount of data is available. We also investigate the possibility of applying DCN to unsupervised speaker adaptation. It is shown that DCN adaptation improves recognition accuracy even without reference transcriptions of the adaptation data. Finally, DCN adaptation is combined with Feature-space Maximum Likelihood Linear Regression (FMLLR). The combination shows promising results in the batch-mode experiments, although the improvement is rather small in the recursive mode.
Similar resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
An on-line acoustic compensation technique for robust speech recognition
In this work we report on the use of an on-line acoustic compensation technique for robust speech recognition. With this technique, the acoustic mismatch between training and actual conditions is reduced through acoustic mapping. At the recognition stage, observation vectors delivered by the acoustic front-end are mapped into a reference acoustic space, while input data are exploited to update the stati...
Convolutional Neural Network with Adaptable Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy continues to improve, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Histogram Equalization Utilizing Window-Based Smoothed CDF Estimation for Feature Compensation
In this letter, we propose a new histogram equalization method to compensate for acoustic mismatches mainly caused by corruption by additive noise and channel distortion in speech recognition. The proposed method employs an improved test cumulative distribution function (CDF) by more accurately smoothing the conventional order statistics-based test CDF with the use of window functions for robust...
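For orientation, the conventional order statistics-based histogram equalization that the last abstract refers to can be sketched roughly as follows. This is a minimal illustration, not the window-smoothed method of that letter nor the DCN algorithm of the main paper. The function name `equalize_to_gaussian`, the standard normal reference distribution, the `(rank - 0.5) / n` CDF estimate, and the toy input values are all assumptions made for the example.

```python
from statistics import NormalDist


def equalize_to_gaussian(values):
    """Map one feature dimension onto a standard normal reference.

    Hypothetical helper: the test CDF is estimated from the rank of each
    frame (order statistics), then pushed through the inverse reference CDF.
    """
    n = len(values)
    if n == 0:
        return []
    reference = NormalDist(mu=0.0, sigma=1.0)  # assumed reference distribution
    # Frame indices sorted by feature value; ties keep their original order.
    order = sorted(range(n), key=lambda i: values[i])
    equalized = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the CDF estimate strictly inside (0, 1).
        cdf = (rank - 0.5) / n
        equalized[idx] = reference.inv_cdf(cdf)
    return equalized


# Toy values for a single cepstral coefficient over five frames (made up).
noisy_c1 = [2.1, 3.4, 2.9, 5.0, 2.5]
equalized = equalize_to_gaussian(noisy_c1)
```

In practice the same mapping would be applied independently to each feature dimension. Because the rank-based CDF is coarse for short utterances, smoothing it, as the letter above proposes, is one way to make the estimate more reliable.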