Bi-directional conversion between graphemes and phonemes using a joint N-gram model
Abstract
In this paper we present a statistical model for language-independent bi-directional conversion between spelling and pronunciation, based on joint grapheme/phoneme units extracted from automatically aligned data. The model is evaluated on spelling-to-pronunciation and pronunciation-to-spelling conversion on the NetTalk database and the CMU dictionary. We also study the effect of including lexical stress in the pronunciation. Although a direct comparison is difficult to make, our model’s performance appears to be as good as or better than that of other data-driven approaches that have been applied to the same tasks.
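As a rough illustration of the idea of a joint n-gram over grapheme/phoneme units, the following Python sketch trains a bigram model on a handful of pre-aligned units and uses it for spelling-to-pronunciation conversion. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy alignments in `ALIGNED`, the bigram order, the add-one smoothing, and the function names `train`, `logprob`, and `spell_to_phonemes`. The abstract does not state the n-gram order, the smoothing method, or the alignment procedure, so the alignment is simply assumed to be given.

```python
import math
from collections import defaultdict

# Hypothetical toy data: each word is already aligned into joint
# (grapheme chunk, phoneme) units. A real system would derive these
# alignments automatically from a pronunciation dictionary.
ALIGNED = [
    [("c", "k"), ("a", "ae"), ("t", "t")],
    [("c", "k"), ("a", "ae"), ("b", "b")],
    [("b", "b"), ("a", "ae"), ("t", "t")],
    [("sh", "sh"), ("i", "ih"), ("p", "p")],
    [("s", "s"), ("i", "ih"), ("p", "p")],
]

START, END = ("<s>", "<s>"), ("</s>", "</s>")


def train(aligned):
    """Count bigrams over joint units; return the counts and the unit vocabulary."""
    bigram = defaultdict(lambda: defaultdict(int))
    vocab = {END}
    for word in aligned:
        seq = [START] + word + [END]
        for prev, cur in zip(seq, seq[1:]):
            bigram[prev][cur] += 1
        vocab.update(word)
    return bigram, vocab


def logprob(bigram, vocab, prev, cur):
    """Add-one smoothed log P(cur | prev) over joint units (smoothing is an assumption)."""
    counts = bigram.get(prev, {})
    total = sum(counts.values())
    return math.log((counts.get(cur, 0) + 1) / (total + len(vocab)))


def spell_to_phonemes(word, bigram, vocab):
    """Find the most probable joint-unit sequence whose grapheme side spells `word`."""
    units = [u for u in vocab if u != END]
    # best[(position, last unit)] = (log score, phonemes emitted so far)
    best = {(0, START): (0.0, [])}
    for pos in range(len(word)):
        for (p, prev), (score, phones) in list(best.items()):
            if p != pos:
                continue
            for unit in units:
                graphemes, phoneme = unit
                if word.startswith(graphemes, pos):
                    key = (pos + len(graphemes), unit)
                    cand = score + logprob(bigram, vocab, prev, unit)
                    if key not in best or cand > best[key][0]:
                        best[key] = (cand, phones + [phoneme])
    finals = [
        (score + logprob(bigram, vocab, prev, END), phones)
        for (p, prev), (score, phones) in best.items()
        if p == len(word)
    ]
    if not finals:
        return None  # the spelling cannot be segmented into known units
    return max(finals, key=lambda item: item[0])[1]


bigram, vocab = train(ALIGNED)
print(spell_to_phonemes("cat", bigram, vocab))
print(spell_to_phonemes("ship", bigram, vocab))
```

Because the model scores joint units rather than one side conditioned on the other, the same counts could in principle drive the reverse direction by matching the phoneme side of each unit against the input and emitting the grapheme side.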
Similar resources
Can Chinese Phonemes Improve Machine Transliteration?: A Comparative Study of English-to-Chinese Transliteration Models
Inspired by the success of English grapheme-to-phoneme research in speech synthesis, many researchers have proposed phoneme-based English-to-Chinese transliteration models. However, such approaches have severely suffered from the errors in Chinese phoneme-to-grapheme conversion. To address this issue, we propose a new English-to-Chinese transliteration model and make systematic comparisons with...
Pronunciation of Proper Names with a Joint N-gram Model for Bi-directional Grapheme-to-Phoneme Conversion
Pronunciation of proper names is known to be a difficult problem, but one of great practical importance for both speech synthesis and speech recognition. Recently a few data-driven grapheme-to-phoneme conversion techniques have been proposed to tackle this problem. In this paper we apply the joint n-gram model for bi-directional grapheme-to-phoneme conversion, which has already been shown to ac...
Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling
This paper presents the successful results of applying joint sequence modeling in Thai grapheme-to-phoneme conversion. The proposed method utilizes Conditional Random Fields (CRFs) in two-stage prediction. The first CRF is used for textual syllable segmentation and syllable type prediction. Graphemes and their corresponding phonemes are then aligned using well-designed many-to-many alignment ru...
Hidden Markov models for grapheme to phoneme conversion
We propose a method for determining the canonical phonemic transcription of a word from its orthography using hidden Markov models. In the model, phonemes are the hidden states and graphemes the observations. Apart from one pre-processing step, the model is fully automatic. The paper describes the basic HMM framework and enhancements which use preprocessing, context dependent models and a sylla...
On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT)
We investigate the use of joint-sequence multigrams to generate L2 mispronunciation lexicons for mispronunciation detection and diagnosis. In the joint-sequence framework, a pair of parallel strings (namely, the input string of either graphemes or phonemes of the canonical pronunciation and the phonetic string of the mispronunciation) are aligned to form joint units for probabilistic estimation...