Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

authors

  • S. Sharifian and S. M. Ahadi
Abstract:

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as maximum a posteriori (MAP) estimation, only the models with available training data are updated. Hence, large amounts of training data are required to obtain significant recognition improvements. In others, such as maximum likelihood linear regression (MLLR), where several general transformations are applied to clusters of models, the results are good for small amounts of training data, but as the training data increases, the performance improvement reaches a saturation level. In this paper, a new approach is introduced that combines the advantages of both techniques to improve the recognition rate. Here, the models with available training data are trained using MAP, while for those with insufficient training data, appropriate prior parameters for MAP estimation are found using MLLR. This technique has yielded better performance than either MAP or MLLR alone in a system based on the FARSDAT speech corpus.
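The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it adapts Gaussian means only, assumes hard frame-to-Gaussian assignments, and takes the MLLR transform (`A`, `b`) as given rather than estimating it. The names `map_mean`, `mllr_mean`, `adapt_means`, `tau`, and `min_frames` are hypothetical and chosen for illustration.

```python
def map_mean(prior_mean, frames, tau):
    """MAP estimate of a Gaussian mean.

    Interpolates between the prior mean and the data, with tau acting as
    the weight (in pseudo-frames) given to the prior. With no frames the
    estimate is exactly the prior mean.
    """
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


def mllr_mean(mean, A, b):
    """Apply a shared affine MLLR-style transform: mu' = A * mu + b."""
    return [
        sum(A[i][j] * mean[j] for j in range(len(mean))) + b[i]
        for i in range(len(b))
    ]


def adapt_means(si_means, frames_by_gaussian, A, b, tau=10.0, min_frames=3):
    """Assumed combination rule (hypothetical, for illustration only).

    Gaussians with enough adaptation frames use the speaker-independent
    mean as the MAP prior. Gaussians with too few frames use the
    MLLR-transformed mean as the MAP prior instead.
    """
    adapted = {}
    for name, si_mean in si_means.items():
        frames = frames_by_gaussian.get(name, [])
        if len(frames) >= min_frames:
            prior = si_mean
        else:
            prior = mllr_mean(si_mean, A, b)
        adapted[name] = map_mean(prior, frames, tau)
    return adapted


# Toy data: Gaussian "a" has ten adaptation frames, Gaussian "b" has none.
si_means = {"a": [0.0, 0.0], "b": [1.0, 1.0]}
frames_by_gaussian = {"a": [[2.0, 2.0] for _ in range(10)]}
A = [[1.0, 0.0], [0.0, 1.0]]  # identity rotation, assumed already estimated
b = [0.5, 0.5]                # shared bias, assumed already estimated

result = adapt_means(si_means, frames_by_gaussian, A, b)
print(result)
```

In this toy run, Gaussian "a" moves halfway toward its data mean because `tau` equals its frame count, while Gaussian "b", which has no data, falls back to its MLLR-transformed mean. A real system would use occupation probabilities from forward-backward alignment and estimate the transform per regression class.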

similar resources

Speaker adaptation for HMM-based speech synthesis system using MLLR

This paper describes a voice characteristics conversion technique for an HMM-based text-to-speech synthesis system. The system uses phoneme HMMs as the speech synthesis units, and voice characteristics conversion is achieved by changing HMM parameters appropriately. To transform the voice characteristics of synthetic speech to the target speaker, we apply an MLLR (Maximum Likelihood Linear Regr...

Speaker Adaptation Using Lattice-based MLLR

This paper presents lattice-based maximum likelihood linear regression (MLLR) for unsupervised adaptation. Lattice MLLR accumulates the statistics used in the MLLR transform estimation procedure using a forward-backward pass through a word-lattice of alternative hypotheses rather than assuming that the 1-best transcription is accurate as in standard unsupervised MLLR. This results in the abilit...

Regression class selection and speaker adaptation with MLLR in Mandarin continuous speech recognition

Currently, CDHMM-based continuous speech recognition has been widely extended to speaker-independent (SI) systems. However, the performance of an SI system depends heavily on the speaker; for accented Mandarin speech in particular, speaker adaptation becomes crucially important for real applications. In this paper, the MLLR approach is studied for speaker adaptation in Mandarin continuous speech ...

Rapid speaker adaptation for continuous speech recognition using merging eigenvoices

Speaker adaptation in eigenvoice space is a popular method for rapid speaker adaptation. To improve the performance of the method and to obtain stabilized results, the number of speaker-dependent models should be increased and a greater number of eigenvoices should be re-estimated. However, the huge computation time required to find eigenvoices makes these solutions difficult, especially in a c...

Improved MLLR speaker adaptation using confidence measures for conversational speech recognition

Automatic recognition of conversational speech tends to have higher word error rates (WER) than read speech. Improvements gained from unsupervised speaker adaptation methods like Maximum Likelihood Linear Regression (MLLR) [1] are reduced because of their sensitivity to recognition errors in the first pass. We show that a more detailed modeling of adaptation classes and the use of confidence me...

Journal title

volume 23, issue 2

pages 39-50

publication date 2005-01

Keywords
