Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis

نویسندگان

  • Mat P. Pollard
  • Barry M. G. Cheetham
  • Colin C. Goodyear
  • Mike D. Edgington
  • A. Lowry
چکیده

To preserve shape-invariance when pitch or time-scale modifying sinusoidally modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated excitation points. Previous methods achieve this by estimating excitation phases at synthesis frame boundaries, disregarding the frequency modulation that may occur between the frame boundary and the nearest modified excitation point. This approximation can produce a significant mis-alignment of the excitation phases, leading to distortion of the temporal structure of the synthetic speech. In this paper, a shape-invariant technique is proposed which aligns the excitation phases at excitation points, whilst allowing for variations in the frequency of the sinusoidal components.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Shape invariant time-scale modification of speech using a harmonic model

A new and simple approach to shape invariant timescale modi cation of speech is presented. The method, based upon a harmonic coding of each speech frame, operates entirely within the original sinusoidal model [3] and makes no use of \pitch-pulse onset times" used by conventional algorithms. Instead, phase coherence, and thus shape invariance, are ensured by exploiting the harmonic relation exis...

متن کامل

Source-filter models for time-scale pitch-scale modification of speech

This paper presents two time-scale pitch-scale modification techniques to be used in speech synthesis systems. They have been applied to Microsoft’s Whistler system, which is based on concatenative synthesis. Both methods are based on a sourcefilter model, one of them using LPC parameters and the other one using cepstral parameters. The proposed methods achieve high quality prosody modification...

متن کامل

A hybrid method oriented to concatenative text-to-speech synthesis

In this paper we present a speech synthesis method for diphonebased text-to-speech systems. Its main goal is to achieve prosodic modifications that result in more natural-sounding synthetic speech. This improvement is especially useful for emotional speech synthesis, which requires high-quality prosodic modification. We present a hybrid method based on TD-PSOLA and the harmonic plus noise model...

متن کامل

Maximum-likelihood dynamic intonation model for concatenative text-to-speech system

In this work we present a Maximum Likelihood (ML) joint pitch curve modeling, inspired by HMM TTS synthesis concept. This model provides an optimal solution for the coarse target intonation curve (3 points per syllable) and incorporates both static and dynamic pitch values for better utterance intonation modeling. The coarse intonation curve may be optionally combined with the original pitch ex...

متن کامل

Development of Concatenative Syllable based Text to Speech Synthesis System for Tamil

This paper addresses the problem of improving the intelligibility of the synthesized speech in Tamil TTS synthesis system. The human speech is artificially generated by Speech synthesis. The normal language text will be automatically converted into speech using Text-to-speech (TTS) system. This paper deals with a corpus-driven Tamil TTS system based on the concatenative synthesis approach. Conc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996