A miniature Chinese TTS system based on tailored corpus
Abstract
Miniature text-to-speech (TTS) systems are widely used in embedded systems and speech chips, where limited resources require the corpus to be relatively small and the computational complexity to be low. In general, speech synthesized by conventional miniature TTS systems lacks naturalness because of the limited corpus size. This paper describes a method for automatically building a small corpus from a large speech database. A new distance measure among candidate instances is also proposed. Based on the tailored corpus, a miniature Chinese TTS system is built that produces speech with high naturalness.
Similar resources
Multilingual Speech Corpora for TTS System Development
In this paper, four speech corpora collected in the Speech Lab of NCTU in recent years are discussed. They include a Mandarin treebank speech corpus, a Min-Nan speech corpus, a Hakka speech corpus, and a Chinese-English mixed speech corpus. Currently, they are used separately to develop a corpus-based Mandarin TTS system, a Min-Nan TTS system, a Hakka TTS system, and a Chinese-English bilingual...
XIMERA: a new TTS from ATR based on corpus-based technologies
This paper describes a new concatenative TTS system under development at ATR. The system, named XIMERA, is based on corpus-based technologies, as was the case for the preceding TTS systems from ATR, namely ν-talk and CHATR. The prominent features of XIMERA are (1) large corpora (a 110-hour corpus of a Japanese male, a 60-hour corpus of a Japanese female, and a 20-hour corpus of a Chinese fema...
Concatenative Mandarin TTS Accommodating Isolated English Words
An experiment exploring a method for realizing a concatenative Chinese TTS that accommodates isolated English words is presented. The experiment was based on an existing concatenative Mandarin TTS system developed at the Motorola China Research Center. The experimental system employs an English word synthesizer based on the concatenation of speech segments stored in an English corpus. The original Engl...
On unit analysis for Cantonese corpus-based TTS
This paper reports a study of unit analysis for concatenative TTS, which usually has an inventory of hundreds of thousands of voice units. It is known that the quality of synthesis units is especially critical to the quality of the resulting corpus-based TTS system. This research focuses on the analysis of a Chinese Cantonese unit inventory, which has been built earlier for open vocabulary Chinese C...
Towards a Chinese text-to-speech system with higher naturalness
This paper presents our research efforts on Chinese text-to-speech towards higher naturalness. The main results can be summarized as follows: 1. In the proposed TTS system, the syllable-sized units were cut out from real recorded speech, and the synthetic speech was generated by concatenating these units back together. 2. The integration of units synthesized by rules with natural units was tested...