Evaluating the pronunciation component of text-to-speech systems for English: a performance comparison of different approaches

Authors

  • Robert I. Damper
  • Yannick Marchand
  • Martin J. Adamson
  • Kjell Gustafson
Abstract

The automatic derivation of word pronunciations from input text is a central task for any text-to-speech system. For general English text at least, this is often thought to be a solved problem, with manually-derived linguistic rules assumed capable of handling “novel” words missing from the system dictionary. Data-driven methods, based on machine learning of the regularities implicit in a large pronouncing dictionary, have received considerable attention recently but are generally thought to perform less well. However, these tentative beliefs are at best uncertain without powerful methods for comparing text-to-phoneme subsystems. This paper contributes to the development of such methods by comparing the performance of four representative approaches to automatic phonemization on the same test dictionary. As well as rule-based approaches, three data-driven techniques are evaluated: pronunciation by analogy (PbA), NETspeak and IB1-IG (a modified k-nearest neighbour method). Issues involved in comparative evaluation are detailed and elucidated. The data-driven techniques outperform rules in accuracy of letter-to-phoneme translation by a very significant margin but require aligned text-phoneme training data and are slower. Best translation results are obtained with PbA at approximately 72% words correct on a reasonably large pronouncing dictionary, compared with something like 26% words correct for the rules, indicating that automatic pronunciation of text is not a solved problem. © 1999 Academic Press

Based on a paper presented at the UK Speech and Language Technology (SALT) Club Workshop on Evaluation in Speech and Language Technology, Sheffield, July 1997.

E-mails: {rid|ym|mja95r}@ecs.soton.ac.uk


Similar resources

Evaluating the Pronunciation Component of Text-to-Speech Systems for English: A Performance Comparison

The automatic derivation of word pronunciations from input text is a central task for any text-to-speech system. For general English text at least, this is often thought to be a solved problem, with manually-derived linguistic rules assumed capable of handling ‘novel’ words missing from the system dictionary. Data-driven methods, based on machine learning of the regularities implicit in a large...


Computer Assisted Pronunciation Teaching (CAPT) and Pedagogy: Improving EFL learners’ Pronunciation Using Clear Pronunciation 2 Software

This study examined the impact of Clear Pronunciation 2 software on teaching English suprasegmental features, focusing on stress, rhythm and intonation. In particular, the software covers five topics in relation to suprasegmental features including consonant cluster, word stress, connected speech, sentence stress and intonation. Seven Iranian EFL learners participated in this study. The study l...


Game-based Teaching of Stress Placement on Multi-syllabic English Words

Accurate pronunciation is an important component of language ability and the main outward linguistic sign of whether someone is a native speaker of a language or not. An area of particular difficulty for Persian-speaking learners of English, which may cause 'foreign accent' or misunderstanding in speaking, is placement of stress on multi-syllable words. Game-based pronunciation teaching can be ...


Advantages of Using Computer in Teaching English Pronunciation

Pronunciation continues to grow in importance because of its key roles in speech recognition, speech perception, and speaker identity. Computer is being increasingly used in teaching English pronunciation to enhance its quality. The purpose of this paper is to discuss the advantages of using computer in English pronunciation instruction. Understanding the advantages of computer is an important ...


DRAMA TRANSLATION from PAGE to STAGE

This study is the result of an attempt to investigate the differences between the Persian translated drama text (page) of each English drama text with its performance on the stage (stage) in Iran. In other words, the present researchers tried to find the implemented changes in page which make it real on the stage in the target language and culture in order to show that in drama translating and ...



Journal:
  • Computer Speech & Language

Volume 13, Issue

Pages  -

Publication date: 1999