String-level MCE for continuous phoneme recognition
Authors
Abstract
In this paper, we present results for the Minimum Classification Error (MCE) [1] framework for discriminative training applied to continuous phoneme recognition tasks. Results obtained using MCE are compared with those for Maximum Likelihood Estimation (MLE). We examine the ability of MCE to attain high recognition performance with a small number of parameters. Phoneme-level and string-level MCE loss functions were used as the optimization criteria for a Prototype-Based Minimum Error Classifier (PBMEC) [2] and an HMM [3]. The former was optimized using Generalized Probabilistic Descent; the latter was optimized using an approximate second-order method, the Quickprop algorithm. Two databases were used in this evaluation: 1) the ATR 5240 isolated word datasets for 6 speakers, in both speaker-dependent and multi-speaker mode; 2) the TIMIT database. For both databases, MCE training yielded striking gains in performance and classifier compactness compared to MLE baselines. For instance, through MCE training, performance similar to that of the Maximum Likelihood Successive State Splitting algorithm (ML-SSS) [4] could be obtained with 20 times fewer parameters.
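For readers unfamiliar with the MCE criterion mentioned in the abstract, the following is a minimal sketch of the commonly used textbook form of the MCE loss: a misclassification measure that compares the score of the correct class with a soft maximum over competing classes, passed through a sigmoid. It is not taken from the paper itself. The function names, the smoothing parameters `eta` and `gamma`, and the toy scores are illustrative assumptions; in a string-level setting the scores would plausibly be log-likelihoods of the correct transcription and of competing hypotheses.

```python
import math


def misclassification_measure(scores, correct, eta=1.0):
    """d = -g_correct + (1/eta) * log(mean over rivals of exp(eta * g_j)).

    `scores` holds one discriminant value per class (or per candidate
    string); `correct` is the index of the true class. `eta` (assumed
    name) controls how closely the rival term approximates a hard max.
    """
    rivals = [g for j, g in enumerate(scores) if j != correct]
    m = max(rivals)  # subtract the max for numerical stability
    mean_exp = sum(math.exp(eta * (g - m)) for g in rivals) / len(rivals)
    return -scores[correct] + m + math.log(mean_exp) / eta


def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Sigmoid of the misclassification measure: a smooth 0/1 error count."""
    d = misclassification_measure(scores, correct, eta)
    return 1.0 / (1.0 + math.exp(-gamma * d))


# Toy discriminant scores for three classes (hypothetical values).
toy_scores = [2.0, 0.5, -1.0]
loss_if_class0_is_true = mce_loss(toy_scores, correct=0)  # correct class wins
loss_if_class1_is_true = mce_loss(toy_scores, correct=1)  # correct class loses
```

Because the loss is differentiable in the scores, it can be minimized with gradient-based methods such as the Generalized Probabilistic Descent and Quickprop procedures named in the abstract; the sketch above only evaluates the loss and does not implement either optimizer.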
Similar resources
Phone-discriminating minimum classification error (p-MCE) training for phonetic recognition
In this paper, we report a study on performance comparisons of discriminative training methods for phone recognition using the TIMIT database. We propose a new method, phone-discriminating minimum classification error (P-MCE), which performs MCE training at the sub-string or phone level instead of at the traditional string level. Aiming at minimizing the phone recognition error rate, P-MCE nev...
Integrating multiple pronunciations during MCE-based acoustic model training for large vocabulary speech recognition
In this paper, we report on the implementation of an automatic method for discovering an appropriate pronunciation for each speech utterance of every speaker and integrating this new information into a minimum classification error (MCE) based training algorithm. The proposed method allows considerably more flexibility in adapting multiple pronunciations during the existing supervised acoustic model trai...
Robust HMM training for unified Dutch and German speech recognition
This paper describes our recent work in developing a unified Dutch and German speech recognition system in the SpeechDat domain. The acoustic component of the multilingual system is accomplished through sharing common phonemes without preserving any information about the languages. We propose a more robust MCE-based training algorithm, where only the language-dependent phoneme models are allowe...
In 5. References Mutual Information Estimation and the Speech Re
the following optimization of the discriminative functions the extended BW algorithm (12) was applied to re-estimate the mixtures of the phoneme models. All phoneme alternatives within the automatically derived phoneme segmentation were used in the calculation of the discriminance measure. According to (6), the mixtures of the correct and of all competing models were updated to minimize the ob...
Reaction Time in Phoneme Recognition: A Comparative Study among Iranian Upper-Intermediate vs. Advanced EFL Learners at Institute Level
The present study investigated reaction time in phoneme recognition, comparing Iranian upper-intermediate and advanced EFL learners at institute level. The main question this study tried to answer was whether there is any difference in reaction time in phoneme recognition among Iranian learners at institute level. To answer the question, 5Upper-Intermedi...