Automatic Selection of Synthesis Units from a Large Speech Database

Authors

  • Jau-Hung Chen
  • Chung-Hsien Wu
Abstract

In this paper, a novel method for the selection of synthesis units is proposed. Monosyllables are adopted as the basic synthesis units. A set of high-quality synthesis units is selected from a large continuous speech database in four steps: pitch period detection and smoothing, speech unit filtering, unit selection, and manual examination. Two cost functions are proposed for obtaining the synthesis units; they minimize the inter- and intra-syllable distortion. The cost functions evaluate parameters including prosodic features, LSP (line spectral pair) frequencies, and the types of syllable concatenation. Experimental results showed a match rate of 48.9%, indicating that about half of the "best" synthesis units can be obtained automatically. A replacement rate of 4.8% was also obtained.
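
The abstract does not give the form of the two cost functions, so the following is only a minimal sketch of the general idea of cost-based unit selection, not the authors' method. It assumes that each candidate for a syllable carries a small prosodic feature vector, that the intra-syllable cost is the Euclidean distance to the mean of all candidates, and that an inter-syllable (concatenation) cost has already been computed. The names `Candidate`, `inter_cost`, `w_intra`, and `w_inter` are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class Candidate:
    unit_id: str
    prosody: Tuple[float, ...]  # hypothetical features, e.g. (mean pitch in Hz, duration in ms)
    inter_cost: float           # hypothetical precomputed concatenation distortion


def centroid(cands: List[Candidate]) -> Tuple[float, ...]:
    """Mean prosodic feature vector over all candidates of one syllable."""
    n = len(cands)
    dims = len(cands[0].prosody)
    return tuple(sum(c.prosody[d] for c in cands) / n for d in range(dims))


def intra_cost(cand: Candidate, reference: Tuple[float, ...]) -> float:
    """Assumed intra-syllable cost: Euclidean distance to the reference vector."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(cand.prosody, reference)))


def select_unit(cands: List[Candidate],
                w_intra: float = 1.0,
                w_inter: float = 1.0) -> Candidate:
    """Pick the candidate with the lowest weighted sum of the two costs."""
    ref = centroid(cands)
    return min(
        cands,
        key=lambda c: w_intra * intra_cost(c, ref) + w_inter * c.inter_cost,
    )


if __name__ == "__main__":
    candidates = [
        Candidate("a", (100.0, 200.0), 5.0),
        Candidate("b", (110.0, 210.0), 1.0),
        Candidate("c", (120.0, 220.0), 5.0),
    ]
    print(select_unit(candidates).unit_id)
```

In a pipeline like the one the abstract describes, an automatic choice of this kind would still be followed by manual examination, which is where figures such as the match rate and replacement rate would come from.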

Similar resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Unit selection in a concatenative speech synthesis system using a large speech database

One approach to the generation of natural-sounding synthesized speech waveforms is to select and concatenate units from a large speech database. Units (in the current work, phonemes) are selected to produce a natural realisation of a target phoneme sequence predicted from text which is annotated with prosodic and phonetic context information. We propose that the units in a synthesis database ca...

Automatically clustering similar units for unit selection in speech synthesis

This paper describes a new method for synthesizing speech by concatenating sub-word units from a database of labelled speech. A large unit inventory is created by automatically clustering units of the same phone class based on their phonetic and prosodic context. The appropriate cluster is then selected for a target unit offering a small set of candidate units. An optimal path is found through ...

Join Cost for Unit Selection Speech Synthesis

In unit-selection speech synthesis systems, synthetic speech is produced by concatenating speech units selected from a large database, or inventory, which contains many instances of each speech unit with varied prosodic and spectral characteristics. Hence, by selecting an appropriate sequence of units, it is possible to synthesize highly natural-sounding speech. The selection of the best unit s...

Segment selection in the L&H RealSpeak laboratory TTS system

The L&H RealSpeak Laboratory TTS (RSLab) system is a corpus based speech synthesis system comprising components that deal with linguistic processing, prosody prediction, segment selection, concatenation and modification. In this paper we focus on the segment selection process. During segment selection, the units in a large database of speech are scored with a cost according to their prosodic/ph...

Publication date: 1999