Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection
نویسندگان
چکیده
In this paper, we employ the concept of HMM-Sufficient Statistics (HMM-Suff Stat) and N-best speakers selection to realize a rapid implementation of Baum-Welch and MLLR. Only a single arbitrary utterance is required which is used to select the N-best speakers HMM-Suff Stat from the training database as adaptation data. Since HMM-Suff Stat are pre-computed offline, computation load is minimized. Moreover, adaptation data from the target speaker is not needed. An absolute improvement of 1.8 % WA is achieved when using the rapid Baum-Welch as opposed to using SI model and an improvement of 1.1 % WA is achieved when the rapid MLLR is used compared to rapid Baum-Welch adaptation using HMM-Suff Stat. Adaptation time is as fast as 6 sec and 7 sec respectively. Evaluation is done in noisy environment conditions where the adaptation algorithm is integrated in a speech dialogue system. Additional experiments with VTLN, MAP, and the conventional MLLR are performed.
منابع مشابه
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملRapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments
This paper describes a multi-template unsupervised speaker adaptation based on HMM-Sufficient Statistics. Multiple class-dependent models based on gender and age are used to push up the adaptation performance while keeping adaptation time within few seconds with just one arbitrary utterance. Adaptation begins with the estimation of speaker‘s class from the N-best neighbor speakers using Gaussia...
متن کاملDoctoral Dissertation Rapid Unsupervised Speaker Adaptation Based on Sufficient Statistics of Hidden Markov Models
In realizing a speech recognition system robust to variation of speakers, an efficient adaptation algorithm is needed. Most adaptation techniques require many adaptation data to carry out an adaptation task. Adaptation data are often collected from the actual speaker itself in several utterances. With the time needed to gather and transcribe the adaptation utterances, together with the actual e...
متن کاملUnsupervised speaker adaptation based on sufficient HMM statistics of selected speakers
This paper describes an efficient method for unsupervised speaker adaptation. This method is based on (1) selecting a subset of speakers who are acoustically close to a test speaker, and (2) calculating adapted model parameters according to the previously stored sufficient HMM statistics of the selected speakers’ data. In this method, only a few unsupervised test speaker’s data are required for...
متن کامل