Automatic corpus-based training of rules for prosodic generation in text-to-speech

Authors

  • Eduardo López Gonzalo
  • Jose M. Rodriguez-Garcia
  • Luis A. Hernández Gómez
  • Juan Manuel Villar-Navarro
Abstract

In this paper, we discuss a methodology for automatic prosodic modeling in Text-to-Speech (TTS) systems. The proposed methodology can be seen as a data-driven strategy to train prosodic rules from the automatic analysis of a specific text and its related speech material. Our corpus-based training procedure relies on an automatic linguistic analysis of the text and on an acoustic analysis of the speech using automatic speech recognition techniques. Beyond the automatic derivation of prosodic rules, our method can be easily extended to obtain specific grammar categories suitable for accurate prosodic modeling of specific tasks. Evaluation results over two different applications and speaker styles reveal that the proposed automatic prosodic generation procedure provides a noticeable increase in naturalness when adapting a TTS system to a new speaker and a new speaking style.

Similar resources

Concept-to-speech generation by integrating syntagmatic features into HMM-based speech synthesis

In conventional concept-to-speech (CTS) methods, a common step is predicting abstract prosodic descriptions, such as the locations of accents and phrase boundaries, from the linguistic information provided by the text generation module. But the prediction results always contain errors, and unacceptable prosodic prediction may ruin the synthesized speech. In addition, linguistic information, whi...

A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese

This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese. A large speech corpus produced by a single speaker is used, and the speech output is synthesized from waveform units of variable lengths, with desired linguistic properties, retrieved from this corpus. Detailed methodologies were developed for designing “phonetically rich” and “prosodically ric...

Automatic Detection of Prosody Phrase Boundaries for Text-to-Speech System

Automatic acquisition of the prosodic phrase boundary detecting rules from the text and speech corpora has always been a difficulty for TTS systems. We collected over 5,000 sentences as the corpus, introduced a method based on the transform-based error-driven learning to get the rules for detecting prosodic phrase boundaries, and then used trees to organize the rules in the TTS system. For usin...

Automatic Prosody Generation in a Text-to-speech System for Hebrew

The paper presents the module for automatic prosody generation within a system for automatic synthesis of high-quality speech based on arbitrary text in Hebrew. The high quality of synthesis is due to the high accuracy of automatic prosody generation, enabling the introduction of elements of natural sentence prosody of Hebrew. Automatic morphological annotation of text is based on the applicati...

A Hierarchical Stochastic Model for Automatic Prediction of Prosodic Boundary Location

Prosodic phrase structure provides important information for the understanding and naturalness of synthetic speech, and a good model of prosodic phrases has applications in both speech synthesis and speech understanding. This work describes a statistical model of an embedded hierarchy of prosodic phrase structure, motivated by results in linguistic theory. Each level of the hierarchy is modeled...

Publication date: 1997