Automatically Creating a Diphone Set from a Speech Database
Abstract
This paper presents a measure that scores various aspects of phone quality. The measure is designed to penalize phone instances that have one or more characteristics undesirable in concatenation-based speech synthesis. Depending on the phone type, these aspects include, among others, spectrum, phase, fundamental frequency, duration, voicing, and plosive quality. We applied this quality measure to select diphone sets from four different speech databases and demonstrate the quality of these diphone sets by means of synthesis examples. These examples show that the proposed measure can be used to select a high-quality diphone set from a speech database.
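The selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's actual measure: the abstract does not say how the per-aspect scores are computed or combined, so the weighted sum, the `Candidate` class, the aspect names, and the weights below are all assumptions made for illustration. The sketch keeps, for each diphone, the candidate instance with the lowest total penalty.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One recorded instance of a diphone (hypothetical structure)."""
    diphone: str      # e.g. "a-b"
    penalties: dict   # aspect name -> non-negative penalty (assumed precomputed)


def total_penalty(candidate, weights):
    """Combine per-aspect penalties as a weighted sum (an assumed combination rule)."""
    return sum(weights.get(aspect, 1.0) * value
               for aspect, value in candidate.penalties.items())


def select_diphone_set(candidates, weights):
    """Keep the lowest-penalty candidate for each diphone."""
    best = {}
    for cand in candidates:
        score = total_penalty(cand, weights)
        if cand.diphone not in best or score < best[cand.diphone][0]:
            best[cand.diphone] = (score, cand)
    return {diphone: cand for diphone, (_, cand) in best.items()}


# Usage with made-up penalty values
first = Candidate("a-b", {"duration": 0.2, "f0": 0.1})
second = Candidate("a-b", {"duration": 0.0, "f0": 0.5})
chosen = select_diphone_set([first, second], weights={"f0": 2.0})
```

Under these assumptions, changing the weights changes which instance is selected, which is one way such a measure could be tuned per phone type.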
Related resources
Creating German unit selection voices for the MARY TTS platform from the BITS corpora
The present paper reports on the creation of German unit selection voices from corpora which had been recorded and annotated previously in the BITS project. We describe the unit selection mechanism of our MARY TTS platform, as well as the tools for creating a synthesis voice from a speech corpus, and their application to the creation of German unit selection voices from the BITS corpora. Becaus...
Halfphones: A Backoff Mechanism for Diphone Unit Selection Synthesis
Diphone backoff mechanisms in text-to-speech ensure that the text can still be synthesized even if some of the diphones it requires are missing from the speech database. This paper describes an automatic method for synthetically creating missing diphones from halfphones that are in the speech database.
Automatic generation of speech synthesis units based on closed loop training
This paper proposes a new method for automatically generating speech synthesis units. A small set of synthesis units is selected from a large speech database by the proposed Closed-Loop Training method (CLT). Because CLT is based on the evaluation and minimization of the distortion caused by the synthesis process such as prosodic modification, the selected synthesis units are most suitable for s...
A biphone constrained concatenation method for diphone synthesis
Diphone concatenation [1] has the advantages of simplicity and a relatively small database of speech when compared to other concatenative synthesis methods (e.g., [2]). However, diphone concatenation faces two notable problems. The first is coarticulation which extends beyond the scope of a single diphone and entails some degree of contextual mismatch for virtually any diphone in at least some ...
Expressing vocal effort in con
A new diphone database with a full diphone set for each of three levels of vocal effort is presented. A theoretical motivation is given why this kind of database will be useful for emotional speech synthesis. Two hypotheses are verified in perception experiments: (I) The three diphone sets are perceived as belonging to the same speaker; (II) The vocal effort intended during database recordings ...