Maximum expected likelihood based model selection and adaptation for nonnative English speakers
Abstract
In this paper, the problem of fast model adaptation for nonnative speakers is addressed from the perspective of model complexity selection. The key challenge lies in selecting complexity reliably when only a small amount of adaptation data is available. A novel maximum expected likelihood (MEL) based technique is proposed that enables model complexity selection from as little as one adaptation sentence. In MEL, the expectation of the log-likelihood is computed based on the mismatch bias between model and data, which is measured from a small amount of adaptation data, and model complexity is selected to maximize the expected likelihood (EL). Experiments were performed on WSJ data from speakers with a wide range of foreign accents. The proposed method led to consistent and significant improvements in recognition accuracy over MLLR for nonnative speakers, without performance degradation on native speakers. The proposed method was also able to dynamically select the optimal model complexity as the available adaptation data increased.
Similar resources
Fast model selection based speaker adaptation for nonnative speech
In this paper, the problem of adapting acoustic models of native English speech to nonnative speakers is addressed from a perspective of adaptive model complexity selection. The goal is to dynamically select model complexity for each nonnative talker so as to optimize the balance between model robustness to pronunciation variations and model detailedness for discrimination of speech sounds. A m...
Maximum Expected Likelihood B and Adaptation for Nonnativ
In this paper, the problem of fast model adaptation for nonnative speakers is addressed from a perspective of model complexity selection. The key challenge lies in reliable complexity selection when only a small amount of adaptation data is available. A novel maximum expected likelihood (MEL) based technique is proposed to enable model complexity selection from using as little as one adaptation...
Prior knowledge guided maximum expected likelihood based model selection and adaptation for nonnative speech recognition
In this paper, an improved method of model complexity selection for nonnative speech recognition is proposed by using maximum a posteriori (MAP) estimation of bias distributions. An algorithm is described for estimating hyper-parameters of the priors of the bias distributions, and an automatic accent classification algorithm is also proposed for integration with dynamic model selection and adap...
A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition
Nonnative speech recognition is becoming more and more important as many speech applications are deployed world wide. Meanwhile, due to the large population of nonnative speakers, speaker adaptation remains the most practical way for providing high performance speech services. Subspace Gaussian Mixture Model (SGMM) has recently been shown to yield superior performance on various native speech r...
A comparative study of speaker adaptation techniques
In previous work, we showed how to constrain the estimation of continuous mixture-density hidden Markov models (HMMs) when the amount of adaptation data is small. We used maximum-likelihood (ML) transformation-based approaches and Bayesian techniques to achieve near native performance when testing nonnative speakers of the recognizer language. In this paper, we study various ML-based techniques...