Maximum likelihood unit selection for corpus-based speech synthesis
Abstract
Corpus-based speech synthesis systems deliver considerable synthesis quality, since unit selection approaches have been optimized over the last decade. Unit selection attempts to find the best combination of speech unit sequences in an inventory so that the perceptual differences between the expected (natural) and synthesized signals are as low as possible. However, mismatches and distortions still occur in concatenative speech synthesis, and they are normally perceptible in the synthesized waveform. Unit selection strategies and parameter tuning therefore remain important issues in improving speech synthesis. We present a novel concept to increase the efficiency of the exhaustive speech unit search within the inventory via a unit selection model. This model is based on a mapping analysis of the concatenation sub-costs, a Bayes optimal classification (BOC), and a maximum likelihood selection (MLS). The principal advantage of the proposed unit selection method is that it does not require exhaustive training to set the weighting coefficients for the target and concatenation sub-costs. It provides an alternative for unit selection but requires further optimization, e.g., by integrating target cost mapping.
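For orientation, the conventional search that the abstract contrasts itself with can be sketched as a dynamic-programming (Viterbi) pass over weighted target and concatenation costs. This is a minimal illustration of that baseline, not the BOC/MLS model proposed in the paper. The function name `select_units`, the weights `w_target` and `w_concat`, and the representation of each unit as a single number are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple


def select_units(
    targets: Sequence[float],
    candidates: Sequence[Sequence[float]],
    target_cost: Callable[[float, float], float],
    concat_cost: Callable[[float, float], float],
    w_target: float = 1.0,   # hypothetical weight; tuned by training in real systems
    w_concat: float = 1.0,   # hypothetical weight; tuned by training in real systems
) -> Tuple[List[int], float]:
    """Return (index of the chosen candidate per position, total path cost)."""
    if not targets or len(targets) != len(candidates):
        raise ValueError("need one non-empty candidate list per target")
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[j]: cheapest cost of any path ending in candidate j of the current position
    best = [w_target * target_cost(targets[0], c) for c in candidates[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor of candidate j at position t

    for t in range(1, len(targets)):
        new_best, pointers = [], []
        for c in candidates[t]:
            # Cost of reaching c from each candidate at the previous position
            joins = [
                best[i] + w_concat * concat_cost(prev, c)
                for i, prev in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(joins)), key=joins.__getitem__)
            new_best.append(joins[i_min] + w_target * target_cost(targets[t], c))
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total
```

In this formulation the two weights have to be set by hand or by training, which is the tuning effort the abstract says the proposed method avoids.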
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics in the multimedia domain is enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
A corpus-based Chinese speech synthesis with contextual dependent unit selection
This paper describes the realization of a corpus-based Chinese speech synthesis system, including the corpus design and unit selection procedure. The system selects the synthesis unit according to context similarity between target unit and candidate unit. Neither prosody parameter prediction nor prosody feature modification is needed. The informal test shows that the synthesized speech is quite...
Automatic prominence annotation of a German speech synthesis corpus: towards prominence-based prosody generation for unit selection synthesis
This paper describes work directed towards the development of a syllable prominence-based prosody generation functionality for a German unit selection speech synthesis system. A general concept for syllable prominence-based prosody generation in unit selection synthesis is proposed. As a first step towards its implementation, an automated syllable prominence annotation procedure based on acoust...
Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis
In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as acoustic reference to be respected. In this way, many studies were elaborated for acoustical unit classification. This type of classification allows separating units according to their symbolic characteristics. Indeed, target cost and concatenat...
Designing a Speech Corpus for Estonian Unit Selection Synthesis
The article reports the development of a speech corpus for Estonian text-to-speech synthesis based on unit selection. Introduced are the principles of the corpus as well as the procedure of its creation, from text compilation to corpus analysis and text recording. Also described are the choices made in the process of producing a text of 400 sentences, the relevant lexical and morphological pref...