High quality TTS voices within one day

Authors

  • Didier Cadic
  • Christophe d'Alessandro
Abstract

State-of-the-art unit-selection text-to-speech systems currently produce very natural synthetic speech, but at the price of a costly and time-consuming voice creation process. We report here an extensive perceptual evaluation of several voice creation strategies, and conclude with a novel one-day process giving access to high-quality TTS voices.

Similar resources

Predicting the quality of text-to-speech systems from a large-scale feature set

We extract 1495 speech features from 2 subjectively evaluated text-to-speech (TTS) databases. These features are extracted from pitch, loudness, MFCCs, spectrals, formants, and intensity. The speech material is synthesized using up to 15 different TTS systems, some of them with up to 8 different voices. We develop quality predictors for TTS signals following two different approaches to handle t...

Corpus and Voices for Catalan Speech Synthesis

In this paper we describe the design and production of a Catalan database for building synthetic voices. Two speakers each recorded 10 hours of speech. The speaker selection and the corpus design aim to provide resources for high quality synthesis. In fact, as a side effect, in the speaker selection process we have produced 10 databases of 1 hour each, which allows producing medium...

Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech

The quality of text-to-speech (TTS) voices built from noisy speech is compromised. Enhancing the speech data before training has been shown to improve quality but voices built with clean speech are still preferred. In this paper we investigate two different approaches for speech enhancement to train TTS systems. In both approaches we train a recurrent neural network (RNN) to map acoustic featur...

Composite TTS voices

A new approach to synthetic voice generation and modification is described. One aspect of the approach is that no attempt is made to parametrize voices, unlike the commonly used Gaussian Mixture Model (GMM) paradigm and the newer eigenvoice techniques. Instead, a straightforward unit selection approach is adopted. A second aspect is that we systematically examine mixing units from different voi...

Publication date: 2010