Joint Target and Join Cost Weight Training for Unit Selection Synthesis
Abstract
One of the key challenges of optimizing a unit selection voice is obtaining suitable target and join cost weights. In this paper we investigate several strategies to train these weights automatically. We test two training algorithms, both based on an acoustic distance that approximates human perception: a modified version of the well-known linear regression training and an iterative algorithm that tries to minimize a selection error. Since a single, global set of weights might not always select the best sequence of units, we also investigate whether using multiple weight sets could improve the synthesis quality.
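As a rough illustration of the regression idea mentioned in the abstract, the sketch below fits sub-cost weights so that a weighted sum of sub-costs predicts an acoustic distance. It is plain least squares, not the modified variant the paper proposes, and the function names (`train_weights`, `total_cost`), the clipping of negative weights, and the normalization to a unit sum are all assumptions made for this example.

```python
import numpy as np


def train_weights(sub_costs, acoustic_dist):
    """Fit one weight per sub-cost by ordinary least squares.

    sub_costs:     array of shape (n_examples, n_subcosts), the target and
                   join sub-cost values measured for each training example.
    acoustic_dist: array of shape (n_examples,), the acoustic distance that
                   the weighted cost should approximate.
    """
    w, *_ = np.linalg.lstsq(sub_costs, acoustic_dist, rcond=None)
    # Assumption: negative weights are not meaningful for costs, so clip them.
    w = np.clip(w, 0.0, None)
    # Assumption: rescale so the weights sum to one (scale does not change
    # which unit sequence has the lowest total cost).
    total = w.sum()
    return w / total if total > 0 else w


def total_cost(sub_costs, weights):
    """Weighted sum of sub-costs for each candidate (one row per candidate)."""
    return np.asarray(sub_costs) @ np.asarray(weights)
```

With trained weights, the candidate with the lowest `total_cost` would be preferred; a full system would apply this inside a search over whole unit sequences rather than to isolated candidates.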
Related resources
Symbolic vs. acoustics-based style control for expressive unit selection
The present paper addresses the issue of flexibility in expressive unit selection speech synthesis by using different style selection techniques. We select units from a mixed-style unit selection database, using either forced style switching, no control, symbolic target cost, or acoustic target cost as a style selection criterion. We assess the effect of selection technique, feature weight and ...
Join Cost for Unit Selection Speech Synthesis
In unit-selection speech synthesis systems, synthetic speech is produced by concatenating speech units selected from a large database, or inventory, which contains many instances of each speech unit with varied prosodic and spectral characteristics. Hence, by selecting an appropriate sequence of units, it is possible to synthesize highly natural-sounding speech. The selection of the best unit s...
Perfect Synthesis for All of the People All of the Time
The quality of speech synthesis has drastically improved over the last ten years. Or at least it appears that this is the case. We have moved from diphones to unit selection. However, although we can produce much more natural sounding examples we have also given up a certain amount of control over what can be synthesized. We have reached the stage where playing a few examples to a non-expert c...
Discriminative weight training for unit-selection based speech synthesis
Concatenative speech synthesis by selecting units from a large database has become popular due to the high quality of its synthesized speech. The units are selected by minimizing the combination of target and join costs for a given sentence. In this paper, we propose a new approach to train the weight parameters associated with the cost functions used for unit selection in concatenative speech synthe...
The VUB Blizzard Challenge 2009 Entry
In this paper we describe the voices we submitted to the 2009 Blizzard Challenge, a yearly challenge to evaluate auditory speech synthesis on common data. Since it is the second time we participate in this challenge, in this paper we focus on the changes we made to our unit selection-based system. The weighted sum of symbolic target costs has been replaced by a single statistical target cost; t...