Multi-tier Non-uniform Unit Selection for Corpus-based Speech Synthesis
Authors
Abstract
This paper describes KB2006, a corpus-based speech synthesis system built on the speech database provided by Blizzard Challenge 2006. The system uses a novel unit selection method called multi-tier non-uniform unit selection. A non-uniform unit (NUU) is defined as a unit sequence that contains one or more adjoining phoneme units. Using the CART algorithm, NUUs with the same phoneme sequence in the inventory are clustered into classes according to their prosodic and acoustic differences. In the unit selection stage, a multi-tier NUU selection algorithm applies different criteria to different NUUs. This discrimination allows candidate units that are close to the target unit to be selected for speech concatenation.
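The abstract does not spell out the tiers or their criteria, so the following is only a minimal sketch of the general idea, under stated assumptions: tiers are ordered by unit length, longer NUUs are tried first, and each tier accepts a candidate only if a hypothetical target cost falls below that tier's threshold. The names `Unit`, `target_cost`, `span_target`, and `select_units`, the pitch and duration features, and the threshold values are all illustrative and are not taken from KB2006. The CART clustering step is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    phonemes: tuple   # phoneme sequence covered by this unit
    pitch: float      # mean F0 in Hz (illustrative feature)
    duration: float   # length in seconds (illustrative feature)


def target_cost(unit, pitch, duration):
    # Hypothetical cost: relative pitch mismatch plus relative duration mismatch.
    return abs(unit.pitch - pitch) / pitch + abs(unit.duration - duration) / duration


def span_target(targets, start, length):
    # Target for a multi-phoneme span: mean pitch and summed duration.
    span = targets[start:start + length]
    pitch = sum(p for p, _ in span) / length
    duration = sum(d for _, d in span)
    return pitch, duration


def select_units(phonemes, targets, inventory, thresholds):
    """Greedy, longest-first selection over tiers keyed by unit length.

    targets[i] is a (pitch, duration) pair for phonemes[i];
    thresholds maps a unit length to the maximum accepted target cost.
    """
    selected = []
    i = 0
    while i < len(phonemes):
        for length in sorted(thresholds, reverse=True):
            key = tuple(phonemes[i:i + length])
            if len(key) < length or key not in inventory:
                continue
            pitch, duration = span_target(targets, i, length)
            best = min(inventory[key], key=lambda u: target_cost(u, pitch, duration))
            if target_cost(best, pitch, duration) <= thresholds[length]:
                selected.append(best)
                i += length
                break
        else:
            raise LookupError(f"no unit covers phoneme {phonemes[i]!r}")
    return selected


# Toy inventory (assumed data, for illustration only).
inventory = {
    ("n", "i"): [Unit(("n", "i"), 200.0, 0.30), Unit(("n", "i"), 120.0, 0.25)],
    ("n",): [Unit(("n",), 150.0, 0.10)],
    ("i",): [Unit(("i",), 150.0, 0.15)],
    ("h",): [Unit(("h",), 150.0, 0.08)],
}
# Long units must match closely; single phonemes are always accepted as a fallback.
thresholds = {2: 0.2, 1: float("inf")}

if __name__ == "__main__":
    targets = [(200.0, 0.10), (200.0, 0.20), (150.0, 0.08)]
    for unit in select_units(["n", "i", "h"], targets, inventory, thresholds):
        print(unit.phonemes, unit.pitch, unit.duration)
```

A stricter threshold on longer units reflects one plausible design choice: a long unit saves concatenation points, but a poorly matching one distorts prosody over a wider span, so the sketch falls back to shorter units when the match is weak. A real system would also weigh concatenation cost and search over whole paths rather than choosing greedily.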
Similar resources
Selecting non-uniform units from a very large corpus for concatenative speech synthesizer
This paper proposes a two-module TTS structure, which bypasses the prosody model that predicts numerical prosodic parameters for synthetic speech. Instead, many instances of each basic unit from a large speech corpus are classified into categories by a CART, in which the expectation of the weighted sum of square regression error of prosodic features is used as splitting criterion. Better prosod...
A concatenative Mandarin TTS system without prosody model and prosody modification
This paper proposes a two-step solution for generating natural prosody in TTS, in which no prosody prediction and modification are needed. A large phonetically and prosodically enriched speech corpus has been collected as the unit pool for the synthesizer. A multi-tier non-uniform unit selection scheme is developed to pick up the most suitable segments for concatenation from the unit pool. Fina...
Hierarchical non-uniform unit selection based on prosodic structure
In speech synthesis systems based on wave concatenation, using longer units can generate more natural synthetic speech. In order to improve the usage of longer units in the corpus, this paper proposed a hierarchical non-uniform unit selection framework. Each layer included in the framework is an independent searching procedure which searches for different sized units and adopts suitable natural...
Combining non-uniform unit selection with diphone based synthesis
This paper describes the unit selection algorithm of a speech synthesis system, which selects the k-best paths over units from a relational unit database. The algorithm uses words and diphones as basic unit types. It is part of a customisable text-to-speech system designed for generating new prompts using a recorded speech corpus, with the option that the user can interactively optimise the resu...