HMM-based Pronunciation Dictionary Generation
Abstract
In this paper, we discuss automatically generating a phonetic pronunciation from the orthographic spelling of a word. This letter-sequence to phoneme-sequence mapping is useful in a variety of contexts, including text-to-speech applications, automatic spelling correction, and generating a pronunciation lexicon for a new training dataset that contains out-of-vocabulary words. A system based on hidden Markov models is described and then used to generate pronunciations for the out-of-vocabulary words and word fragments in the Fisher conversational telephone speech corpus. The Fisher phonetic pronunciations are analyzed to show that, for conversational speech and a typical phonetic dictionary, a large amount of lexical ambiguity remains even when the word boundaries and phonetic transcriptions are known.
Related articles
Accuracy Analysis of Generalized Pronunciation Variant Selection in ASR Systems
Automated speech recognition systems typically work with a pronunciation dictionary to generate the expected phonetic content of particular words in a recognized utterance. But the pronunciation can vary in many situations. Besides the cases with more possible pronunciation variants specified manually in the dictionary, there are typically many other possible changes in the pronunciation depending on...
Automatic generation and pruning of phonetic mispronunciations to support computer-aided pronunciation training
This paper presents a mispronunciation detection system which uses automatic speech recognition to support computer-aided pronunciation training (CAPT). Our methodology extends a model pronunciation lexicon with possible phonetic mispronunciations that may appear in learners’ speech. Generation of these pronunciation variants was previously achieved by means of phone-to-phone mapping rules deriv...
Speaker adaptation using a parallel phone set pronunciation dictionary for Thai-English bilingual TTS
This paper develops a bilingual Thai-English TTS system from two monolingual HMM-based TTS systems. An English Nagoya HMM-based TTS system (HTS) provides correct pronunciations of English words but the voice is different from the voice in a Thai HTS system. We apply a CSMAPLR adaptation technique to make the English voice sound more similar to the Thai voice. To overcome a phone mapping proble...
Signal driven generation of word baseforms from few examples
The work described in this paper attempts to automatically generate word baseforms as used in the pronunciation dictionaries of large vocabulary speech recognition systems. The input to the algorithm consists of several sample utterances per word. No additional information, such as word spelling, is used. The task involves determining a suitable inventory of subword units (SWU) as well as det...
Pronunciation variation speech recognition without dictionary modification on sparse database
Generally, a speech recognition system uses a fixed set of pronunciations according to the dictionary for training and decoding. However, even a well-defined lexicon cannot support all variations in human pronunciation. Besides, in order to cover all possible pronunciations, the size of the dictionary would be too large to implement. Sharing Gaussian densities across phonetic model...