Improving the Performance of a Dutch CSR by Modeling Pronunciation Variation
Authors
Abstract
This paper describes how the performance of a continuous speech recognizer for Dutch has been improved by modeling pronunciation variation. We used three methods to model pronunciation variation. First, within-word variation was dealt with. Phonological rules were applied to the words in the lexicon, thus automatically generating pronunciation variants. Secondly, cross-word pronunciation variation was accounted for by adding multi-words and their variants to the lexicon. Thirdly, probabilities of pronunciation variants were incorporated in the language model (LM), and thresholds were used to choose which pronunciation variants to add to the LMs. For each of the methods, recognition experiments were carried out. A significant improvement in error rates was measured.
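The first method, generating within-word variants by applying phonological rules to the lexicon, can be illustrated with a minimal sketch. The two rules, the SAMPA-like phone strings, and the names `generate_variants`, `RULES` and `lexicon` are all assumptions made for illustration; the abstract does not specify the rules or the lexicon format used in the experiments.

```python
import re

# Hypothetical rewrite rules over phone strings (illustrative only):
# each rule is (regex pattern, replacement) and is treated as optional.
RULES = [
    (r"@n$", "@"),  # assumed rule: delete word-final /n/ after schwa
    (r"st$", "s"),  # assumed rule: delete word-final /t/ after /s/
]


def generate_variants(phones, rules):
    """Return the canonical pronunciation plus every form reachable
    by optionally applying each rule in turn."""
    variants = {phones}
    for pattern, replacement in rules:
        # Apply the rule to every variant found so far; keep the
        # unmodified forms too, since each rule is optional.
        variants |= {re.sub(pattern, replacement, v) for v in variants}
    return variants


# Toy lexicon: orthographic word -> canonical phone string (assumed format).
lexicon = {"lopen": "lop@n", "kast": "kAst"}

expanded = {
    word: sorted(generate_variants(phones, RULES))
    for word, phones in lexicon.items()
}

for word, prons in expanded.items():
    print(word, prons)
```

Keeping every rule optional means the canonical form always stays in the lexicon alongside its variants, so the recognizer can still choose it.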
Similar resources
Improving the performance of a Dutch CSR by modeling within-word and cross-word pronunciation variation
This article describes how the performance of a Dutch continuous speech recognizer was improved by modeling pronunciation variation. We propose a general procedure for modeling pronunciation variation. In short, it consists of adding pronunciation variants to the lexicon, retraining phone models and using language models to which the pronunciation variants have been added. First, within-word pr...
Full text
Modeling pronunciation variation for a Dutch CSR: testing three methods
This paper describes how the performance of a continuous speech recognizer for Dutch has been improved by modeling pronunciation variation. We used three methods to model pronunciation variation. First, within-word variation was dealt with. Phonological rules were applied to the words in the lexicon, thus automatically generating pronunciation variants. Secondly, cross-word pronunciation variat...
Full text
Modeling Within-word and Cross-word Pronunciation Variation to Improve the Performance of a Dutch CSR
This paper describes how the performance of a continuous speech recognizer for Dutch has been improved by modeling within-word and cross-word pronunciation variation. Within-word variants were automatically generated by applying five phonological rules to the words in the lexicon. For the within-word method, a significant improvement is found compared to the baseline. Cross-word pronunciation v...
Full text
Making a difference: On automatic transcription and modeling of Dutch pronunciation variation for automatic speech recognition
The first goal of this study is to investigate the effect of several properties of a continuous speech recognizer (CSR) on automatic phonetic transcription. Our results show that changing certain properties of the CSR affects the resulting automatic transcriptions. The quality of the automatic transcriptions can be improved by using 'short' HMMs and by reducing the amount of contami...
Full text
Comparison between Expert Listeners and Continuous Speech Recognizers in Selecting Pronunciation Variants
In this paper, the performance of an automatic transcription tool is evaluated. The transcription tool is a continuous speech recognizer (CSR) which can be used to select pronunciation variants (i.e. detect insertions and deletions of ph...
Full text