Reconciling pronunciation differences between the front-end and the back-end in the IBM speech synthesis system

Authors

  • Wael Hamza
  • Ellen Eide
  • Raimo Bakis
Abstract

In this paper, methods for reconciling pronunciation differences between a rule-based front-end and the pronunciations observed in a database of recorded speech are presented. The methods are applied to the IBM Expressive Speech Synthesis System [1] for both unrestricted and limited-domain text-to-speech synthesis. One method is based on constructing a multiple pronunciation lattice for the given sentence and scoring it using word and phoneme n-gram statistics computed from the target speaker’s database. A second method consists of storing observed pronunciations and introducing them as alternates in the search. We compare the strengths and weaknesses of these two methods. Results show that improvements are achieved in both limited and unrestricted domains, with the largest gains coming in the limited-domain case.
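The two methods in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a phoneme bigram model with add-one smoothing (the paper uses word and phoneme n-grams), and it searches the lattice by exhaustive enumeration, which is practical only for toy inputs. All function names, the phone symbols, and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def train_phone_bigrams(utterances):
    """Count phoneme bigrams over the speaker's recorded utterances."""
    context_counts, bigram_counts, symbols = Counter(), Counter(), set()
    for phones in utterances:
        padded = ["<s>", *phones, "</s>"]
        symbols.update(padded)
        context_counts.update(padded[:-1])
        bigram_counts.update(zip(padded, padded[1:]))
    return context_counts, bigram_counts, len(symbols)


def log_prob(phones, model, alpha=1.0):
    """Smoothed bigram log-probability of one phoneme sequence."""
    context_counts, bigram_counts, vocab_size = model
    padded = ["<s>", *phones, "</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        numerator = bigram_counts[(prev, cur)] + alpha
        denominator = context_counts[prev] + alpha * vocab_size
        total += math.log(numerator / denominator)
    return total


def build_lattice(words, front_end, observed):
    """Per word: the front-end pronunciation plus stored observed alternates."""
    lattice = []
    for word in words:
        candidates = [front_end[word]]
        for alternate in observed.get(word, []):
            if alternate not in candidates:
                candidates.append(alternate)
        lattice.append(candidates)
    return lattice


def best_path(lattice, model):
    """Score every path through the lattice and keep the most likely one."""
    best_score, best_choice = None, None
    for choice in product(*lattice):
        phones = [phone for word_pron in choice for phone in word_pron]
        score = log_prob(phones, model)
        if best_score is None or score > best_score:
            best_score, best_choice = score, choice
    return best_score, best_choice


# Toy data (hypothetical): the speaker says "the" as DH AH,
# while the rule-based front-end predicts DH IY.
recorded = [
    ("DH", "AH", "K", "AE", "T"),
    ("DH", "AH", "D", "AO", "G"),
    ("DH", "AH", "K", "AE", "T"),
]
front_end = {"the": ("DH", "IY"), "cat": ("K", "AE", "T")}
observed = {"the": [("DH", "AH")]}

model = train_phone_bigrams(recorded)
lattice = build_lattice(["the", "cat"], front_end, observed)
score, choice = best_path(lattice, model)
print(choice)
```

In this toy case the path using the pronunciation seen in the recordings scores higher than the front-end's default, which is the kind of mismatch the paper sets out to reconcile. A real system would replace the enumeration with a dynamic-programming search over the lattice.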

Similar resources

Computer Assisted Pronunciation Teaching (CAPT) and Pedagogy: Improving EFL learners’ Pronunciation Using Clear Pronunciation 2 Software

This study examined the impact of Clear Pronunciation 2 software on teaching English suprasegmental features, focusing on stress, rhythm and intonation. In particular, the software covers five topics in relation to suprasegmental features including consonant cluster, word stress, connected speech, sentence stress and intonation. Seven Iranian EFL learners participated in this study. The study l...

Optimization of Text-To-Speech pho posteriori signal co

One issue arising in text-to-phone conversion is inconsistency between its output and the phonetic time-alignment of the dataset, hindering the back-end’s ability to access the best units to synthesize a text. Some such inconsistency is inevitable because dataset labeling requires allowance for alternate pronunciations of words, while the front-end typically predicts a single pronunciation for ...

A Corpus-Based Concatenative Speech Synthesis System for Turkish

Speech synthesis is the process of converting written text into machine-generated synthetic speech. Concatenative speech synthesis systems form utterances by concatenating pre-recorded speech units. Corpus-based methods use a large inventory to select the units to be concatenated. In this paper, we design and develop an intelligible and natural sounding corpus-based concatenative speech synthes...

Assessment and Design Enhancement of the Front End Energy Absorption Mechanism of a Locomotive Based on an Impact Scenario

This research assesses the behavior of the front-end energy absorption system of a locally manufactured locomotive in crash situations. The causes of the extensive damage to the energy absorption apparatus, which includes the crash element and the buffer, are studied. By choosing the proper damage model, the conditions of the accident are simulated by using the ABAQUS engineering ...

Publication date: 2004