Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems
Abstract
This paper presents a data pruning approach for building the acoustic unit inventory of a syllable-based, concatenative, embeddable Chinese TTS system. A three-portion segmentation of the syllable is proposed, based on the voiced/unvoiced structure of Chinese syllables. Individual factorial acoustic measurements of each syllable are used to calculate a penalty for perceptually unsatisfactory concatenation. According to the calculated penalties, bad syllables are removed from each cluster, and the best syllable of each pruned cluster is then selected using a compromise acoustic measurement. Evaluation and application results show that the method is promising, particularly for generating acoustic unit databases for small-footprint concatenative Chinese (Cantonese and Mandarin) TTS systems.
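The prune-then-select idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual penalty formula, acoustic factors, or thresholds, so everything here is an assumption made for illustration: the function name `prune_and_select`, the factors `duration` and `f0`, the weights, and the penalty itself (a weighted sum of absolute deviations from the cluster mean).

```python
from statistics import mean

def prune_and_select(cluster, weights, threshold):
    """Prune high-penalty units from a cluster, then pick the best survivor.

    cluster:   list of dicts {"id": str, "measures": {factor: value}}
    weights:   {factor: weight}; hypothetical, not taken from the paper
    threshold: units whose penalty exceeds this value are pruned
    """
    factors = list(weights)

    def centroid_of(units):
        # Mean value of every factor over the given units.
        return {f: mean(u["measures"][f] for u in units) for f in factors}

    def penalty(unit, centroid):
        # Assumed penalty: weighted absolute deviation from the centroid.
        return sum(weights[f] * abs(unit["measures"][f] - centroid[f])
                   for f in factors)

    # Step 1: remove "bad" units relative to the full cluster.
    full = centroid_of(cluster)
    kept = [u for u in cluster if penalty(u, full) <= threshold]
    if not kept:
        # Never return an empty cluster; keep the least-penalized unit.
        kept = [min(cluster, key=lambda u: penalty(u, full))]

    # Step 2: select the unit closest to the pruned cluster's centroid,
    # standing in for the combined ("compromise") measurement.
    pruned = centroid_of(kept)
    best = min(kept, key=lambda u: penalty(u, pruned))
    return best, kept

# Toy cluster of four candidate recordings of one syllable (made-up values).
cluster = [
    {"id": "a", "measures": {"duration": 200.0, "f0": 120.0}},
    {"id": "b", "measures": {"duration": 210.0, "f0": 122.0}},
    {"id": "c", "measures": {"duration": 400.0, "f0": 200.0}},
    {"id": "d", "measures": {"duration": 205.0, "f0": 121.0}},
]
best, kept = prune_and_select(cluster, {"duration": 1.0, "f0": 1.0}, 100.0)
print(best["id"], [u["id"] for u in kept])
```

In this toy run the outlier `c` is pruned and `d`, which sits at the centre of the remaining units, is selected. A real system would replace the assumed penalty with the perceptually motivated measurements described in the paper.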
Related resources
Data pruning using confidence measures for concatenative synthesis system built using automatically transcribed audio
Today, we can record and store large amounts of single-speaker audio data, and also download it from the web. Generally, these data are prosodically rich and can therefore act as excellent candidates for building concatenative text-to-speech (TTS) systems. But transcriptions for these audio data are often not available, and automatic transcriptions are error-prone. In addition, these audio data ...
Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis
Prosody is an important factor in the quality of text-to-speech (TTS) synthesis. Typically, acoustic parameters such as f0 and duration are the only variables related to prosody that are used to determine unit selection. Our study explored adding the explicit use of linguistically and perceptually motivated prosodic categories in unit selection-based TTS. One of our goals was to automate the pro...
An embedded and concatenative approach to TTS of multiple languages
This paper presents an embedded and concatenative approach to multilingual text-to-speech system (ECMTTS). Under a uniform architecture, the TTS modules are separated into language dependent and independent ones. A specifically defined super phonetic symbol set enables to use uniform spee...
On unit analysis for Cantonese corpus-based TTS
This paper reports a study of unit analysis for concatenative TTS, which usually has an inventory of hundreds of thousands of voice units. It is known that the quality of synthesis units is especially critical to the quality of the resulting corpus-based TTS system. This research focuses on the analysis of a Chinese Cantonese unit inventory, which has been built earlier for open vocabulary Chinese C...
A Corpus-Based Concatenative Speech Synthesis System for Turkish
Speech synthesis is the process of converting written text into machine-generated synthetic speech. Concatenative speech synthesis systems form utterances by concatenating pre-recorded speech units. Corpus-based methods use a large inventory to select the units to be concatenated. In this paper, we design and develop an intelligible and natural sounding corpus-based concatenative speech synthes...