Learning new word pronunciations from spoken examples

Authors

  • Ibrahim Badr
  • Ian McGraw
  • James R. Glass
Abstract

A lexicon containing explicit mappings between words and pronunciations is an integral part of most automatic speech recognizers (ASRs). While many ASR components can be trained or adapted using data, the lexicon is one of the few that typically remains static until experts make manual changes. This work takes a step towards alleviating the need for manual intervention by integrating a popular grapheme-to-phoneme conversion technique with acoustic examples to automatically learn high-quality baseform pronunciations for unknown words. We explore two models in a Bayesian framework, and discuss their individual advantages and shortcomings. We show that both are able to generate better-than-expert pronunciations with respect to word error rate on an isolated word recognition task.
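
As a rough illustration of the general idea of combining a grapheme-to-phoneme prior with acoustic evidence, the sketch below picks the candidate baseform b that maximizes log P(b | w) + sum_i log P(u_i | b) over spoken examples u_i of a word w. This is a generic Bayesian scoring rule assumed for illustration only; the abstract does not specify the two models the authors explore, and the function name, candidate pronunciations, and all numbers here are hypothetical.

```python
import math


def best_pronunciation(g2p_log_prior, acoustic_log_likelihoods):
    """Pick the candidate with the highest combined log score.

    g2p_log_prior: dict mapping a candidate pronunciation to log P(b | w),
        as a grapheme-to-phoneme model might supply (hypothetical values).
    acoustic_log_likelihoods: dict mapping the same candidates to a list of
        log P(u_i | b), one entry per spoken example (hypothetical values).
    """
    scores = {}
    for pron, log_prior in g2p_log_prior.items():
        # Spoken examples are treated as independent given the pronunciation,
        # so their log-likelihoods add.
        scores[pron] = log_prior + sum(acoustic_log_likelihoods[pron])
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical candidates for one word, with made-up probabilities.
g2p_log_prior = {
    "t ah m ey t ow": math.log(0.6),
    "t ah m aa t ow": math.log(0.4),
}
acoustic_log_likelihoods = {
    "t ah m ey t ow": [-12.0, -11.5, -13.0],
    "t ah m aa t ow": [-10.0, -10.5, -11.0],
}

best, scores = best_pronunciation(g2p_log_prior, acoustic_log_likelihoods)
print(best)
```

In this toy setup the spelling-based prior favors the first candidate, but the three spoken examples fit the second one better, so the acoustic evidence outweighs the prior.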


Similar resources

Phonological ambiguity and lexical ambiguity: effects on visual and auditory word recognition.

Three experiments in Serbo-Croatian were conducted on the effects of phonological ambiguity and lexical ambiguity on printed word recognition. Subjects decided rapidly if a printed and a spoken word matched or not. Printed words were either phonologically ambiguous (two possible pronunciations) or unambiguous. If phonologically ambiguous, either both pronunciations were real words or only one w...

Pronunciation modeling for large vocabulary conversational speech recognition

In this paper, we address the issue of deriving and using more realistic pronunciations to represent words spoken in natural conversational speech. Previous approaches include using automatic phoneme-based rule-learning techniques [1, 2, 7], linguistic transformation rules [4, 8], and phonetically hand-labelled corpus [3] to expand the number of pronunciation variants per word. While rule-based...

Learning linguistically valid pronunciations from acoustic data

We describe an algorithm to learn word pronunciations from acoustic data. The algorithm jointly optimizes the pronunciation of a word using (a) the acoustic match of this pronunciation to the observed data, and (b) how “linguistically reasonable” the pronunciation is. Variations of word pronunciations in the recognition dictionary (which was created by linguists), are used to train a model of w...

Flavoured acoustic model and combined spelling to sound for asymmetrical bilingual environment

The most common target of multilingual ASR aims at covering various speakers from various languages. The problem we address in this article is more specifically an asymmetrical bilingual scenario, where the same speaker may insert in his speech some foreign words using foreign pronunciations. This is a frequent situation for French as spoken in Canada, where English proper names are often spoke...

A probabilistic approach to pronunciation by analogy

The relationship between written and spoken words is convoluted in languages with a deep orthography such as English and therefore it is difficult to devise explicit rules for generating the pronunciations for unseen words. Pronunciation by analogy (PbA) is a data-driven method of constructing pronunciations for novel words from concatenated segments of known words and their pronunciations. PbA...


Publication date: 2010