Diphone collection and synthesis
Abstract
In this paper, we describe the design and collection of corpora for diphone synthesis, the voice building process, and our experience in creating a new, publicly available database of ten diphone sets from one American English speaker for the Festival Speech Synthesis System [3], using the FestVox document and tools [1]. To support our goal of making the tools and techniques available so that anyone can build their own synthetic voices, we have generalized and streamlined the tasks involved, turning what were once arcane anecdotes, half-written one-off scripts, and partial descriptions into detailed, complete instructions that others have followed with good results.
Similar resources
Implementation and evaluation of a text-to-speech synthesis system for Turkish
In this paper, a diphone-based Text-to-Speech (TTS) system for the Turkish language is presented. Turkish is the official language of Turkey, where it is the native language of 70 million people, and it is also widely spoken in Asia (Azerbaijan, Uzbekistan, Kazakhstan, Kyrgyzstan and Iran), Cyprus and the Balkans. The research has been done through a visiting internship at CSLR (the Center fo...
A biphone constrained concatenation method for diphone synthesis
Diphone concatenation [1] has the advantages of simplicity and a relatively small database of speech when compared to other concatenative synthesis methods (e.g., [2]). However, diphone concatenation faces two notable problems. The first is coarticulation, which extends beyond the scope of a single diphone and entails some degree of contextual mismatch for virtually any diphone in at least some ...
Synthesis and Control of Synthesis Using a Generalized Diphone Method
Generalized Diphone Control is a powerful means of building a musical phrase from dictionaries of analysed sound units by building sequences of units and concatenating and articulating them. Through a graphical user interface on Macintosh, the Diphone 2.0 software provides analysis, control and synthesis according to various models, such as the Sinusoidal Additive model and the Chant model. A la...
Model based analysis of a diphone database for improved unit concatenation
One crucial point of concatenation approaches using diphones is handling the discontinuities between the concatenated units. This problem is treated by a suitable analysis of the diphones for a parametric synthesis. The model of the parametric synthesis is the lossy tube model, which is an extension of the standard lattice filter that accounts for frequency-dependent vocal tract losses. The paramete...
Modeling and Synthesizing Emotional Speech for Catalan Text-to-Speech Synthesis
This paper describes an initial approach to emotional speech synthesis in Catalan based on a diphone concatenation TTS system. The main goal of this work is to develop a simple prosodic model for expressive synthesis. This model is obtained from an emotional speech collection artificially generated by means of a copy-prosody experiment. After validating the emotional content of this collection,...