Uniform concatenative excitation model for synthesising speech without voiced/unvoiced classification
Author
Abstract
In general, speech synthesis using the source-filter model of speech production requires classifying speech into two classes (voiced and unvoiced), a step that is prone to errors. For voiced speech, the input of the synthesis filter is an approximately periodic excitation, whereas for unvoiced speech it is a noise signal. This paper proposes an excitation model which can be used to synthesise both voiced and unvoiced speech, thus overcoming the degradation in speech quality caused by those classification errors. The model represents two contiguous segments of the residual signal pitch-synchronously. The first segment is represented by the original residual in a fraction of the period around the pitch-mark (obtained using an epoch detector), in order to capture the most important aspects of the residual during voiced speech. The remaining part of the period is modelled by a set of parameters describing the amplitude envelope of the residual waveform and its energy. The technique for synthesising the excitation combines these shaping parameters with a novel method for regenerating the residual waveform and a method for mixing a periodic signal with noise based on the Harmonic plus Noise model. Besides producing high-quality speech, this technique is computationally fast.
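The mixing step the abstract mentions can be illustrated with a minimal sketch in the spirit of the Harmonic plus Noise model: harmonics of the fundamental occupy the band below a maximum voiced frequency, while spectrally shaped noise fills the band above it. The function name, parameter names, and default values below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def hnm_style_excitation(f0, fs, n_samples, max_voiced_freq=4000.0, noise_gain=0.3):
    """Mix a harmonic signal (below max_voiced_freq) with high-pass
    noise, loosely following the Harmonic plus Noise model idea.
    All names and defaults here are illustrative, not from the paper."""
    t = np.arange(n_samples)
    # Periodic part: sum of harmonics of f0 up to the maximum voiced frequency
    n_harm = int(max_voiced_freq // f0)
    periodic = np.zeros(n_samples)
    for k in range(1, n_harm + 1):
        periodic += np.cos(2.0 * np.pi * k * f0 * t / fs)
    periodic /= max(n_harm, 1)
    # Noise part: white noise restricted to frequencies above max_voiced_freq
    noise = np.random.randn(n_samples)
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spec[freqs < max_voiced_freq] = 0.0   # zero out the "voiced" band
    noise_hp = np.fft.irfft(spec, n=n_samples)
    return periodic + noise_gain * noise_hp
```

For fully unvoiced frames, the same routine degenerates gracefully: setting the maximum voiced frequency to zero leaves only the noise component, which is what makes a single excitation model usable for both speech classes.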
Similar papers
Low Resource TTS Synthesis Based on Cepstral Filter with Phase Randomized Excitation
In this paper we present the acoustic synthesis of a low-resource Text-To-Speech (TTS) system based on a 7th-order cepstral filter. The excitation signal is designed in the frequency domain by a two-parameter model. This model is able to generate the excitation signal for both voiced and unvoiced segments. The sets of filter coefficients represent the speech units and are stored in a compressed fo...
A Variable Rate Speech Codec Using VUS Classification
Voiced speech is highly correlated and must be reconstructed accurately in order to sound correct. Unvoiced speech, on the other hand, is noise-like in nature. It can be approximated by white noise coloured by the vocal tract filter. Because of this lack of structure in unvoiced speech sounds, the excitation signal does not have to reproduce the speech signal as accurately as for voiced sounds. T...
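The "white noise coloured by the vocal tract filter" approximation described above can be sketched directly: a noise excitation is pushed through an all-pole synthesis filter 1/A(z). The function name, the example LPC coefficients, and the gain below are placeholders; in practice the coefficients come from LP analysis of the frame.

```python
import numpy as np

def synth_unvoiced(lpc_coeffs, gain, n_samples, seed=0):
    """Approximate an unvoiced segment as white noise passed through
    an all-pole vocal-tract filter 1/A(z). Coefficients and gain are
    illustrative placeholders, normally obtained by LP analysis."""
    rng = np.random.default_rng(seed)
    excitation = rng.standard_normal(n_samples)
    a = np.asarray(lpc_coeffs, dtype=float)  # a[0] is assumed to be 1.0
    y = np.zeros(n_samples)
    for n in range(n_samples):
        acc = gain * excitation[n]
        # All-pole recursion: y[n] = gain*e[n] - sum_k a[k] * y[n-k]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y
```

Because the noise source has no structure to preserve, only the filter and gain need to match the analysed frame, which is exactly why unvoiced excitation can be coded far more cheaply than voiced excitation.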
Least relative entropy for voiced/unvoiced speech classification
The aim of this work is to develop a flexible and efficient approach to the classification of the ratio of voiced to unvoiced excitation sources in continuous speech. To achieve this aim we adopt a probabilistic neural network approach. This is accomplished by designing a multilayer perceptron classifier trained by steepest-descent minimization of the Least Relative Entropy (LRE) cost function. By us...
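The cost described in this snippet can be illustrated with a minimal stand-in: a relative-entropy (Kullback-Leibler) cost between Bernoulli voicing targets and a logistic output, minimized by steepest descent. The single-layer unit, function names, and learning rate below are simplifying assumptions; the paper itself trains a multilayer perceptron.

```python
import numpy as np

def relative_entropy(p, q, eps=1e-9):
    """D(p||q) for Bernoulli targets p and outputs q in (0, 1).
    A minimal stand-in for the paper's Least Relative Entropy cost."""
    p = np.clip(p, eps, 1.0 - eps)
    q = np.clip(q, eps, 1.0 - eps)
    return np.mean(p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q)))

def sgd_step(w, x, p, lr=0.1):
    """One steepest-descent step for a single logistic unit
    (illustrative; the paper uses a multilayer perceptron)."""
    q = 1.0 / (1.0 + np.exp(-(x @ w)))   # logistic output
    grad = x.T @ (q - p) / len(p)        # gradient of D(p||q) w.r.t. w
    return w - lr * grad
```

For a logistic output the gradient of D(p||q) with respect to the pre-activation reduces to q - p, the same simple form as cross-entropy, since the entropy of the fixed targets is constant.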
Segregation of unvoiced speech from nonspeech interference.
Monaural speech segregation has proven to be extremely challenging. While efforts in computational auditory scene analysis have led to considerable progress in voiced speech segregation, little attention has been given to unvoiced speech, which lacks harmonic structure and has weaker energy, and is hence more susceptible to interference. This study proposes a new approach to the problem of segregating...
Improved training of excitation for HMM-based parametric speech synthesis
This paper presents an improved method of training the unvoiced filter that forms part of an excitation model, within the framework of parametric speech synthesis based on hidden Markov models. The conventional approach calculates the unvoiced filter response from the differential signal between the residual and the voiced excitation estimate. The differential signal, however, includes the error generat...