Perceptually weighted linear transformations for voice conversion

Authors

  • Hui Ye
  • Steve J. Young
Abstract

Voice conversion is a technique for modifying a source speaker’s speech to sound as if it were spoken by a target speaker. A popular approach to voice conversion is to apply a linear transformation to the spectral envelope. However, conventional parameter estimation based on least-squares error optimization does not necessarily lead to the best perceptual result. In this paper, a perceptually weighted linear transformation is presented which is based on minimizing the perceptual spectral distance between the voices of the source and target speakers. The paper describes the new conversion algorithm and presents a preliminary evaluation of its performance based on objective and subjective tests.
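
As a rough illustration of the estimation problem the abstract refers to, the sketch below fits an affine map from source to target spectral feature vectors by least squares, with optional per-frame weights. It is not the paper's method: the abstract does not specify the perceptual weighting, so the `frame_weights` argument, the function names `fit_linear_transform` and `convert`, and the assumption that source and target frames are already time-aligned are all hypothetical choices made for this example.

```python
import numpy as np


def fit_linear_transform(source, target, frame_weights=None):
    """Fit target ~ source @ A.T + b by (optionally weighted) least squares.

    source, target: arrays of shape (n_frames, n_dims), assumed time-aligned.
    frame_weights: optional non-negative weights, one per frame (hypothetical;
    a stand-in for whatever perceptual weighting a real system would use).
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    n_frames = source.shape[0]
    if frame_weights is None:
        frame_weights = np.ones(n_frames)

    # Append a constant column so the bias b is estimated together with A.
    augmented = np.hstack([source, np.ones((n_frames, 1))])

    # Scaling each row by sqrt(w) turns weighted least squares
    # into an ordinary least-squares problem.
    root_w = np.sqrt(np.asarray(frame_weights, dtype=float))[:, None]
    solution, *_ = np.linalg.lstsq(augmented * root_w, target * root_w, rcond=None)

    A = solution[:-1].T  # (n_dims, n_dims) linear part
    b = solution[-1]     # (n_dims,) bias
    return A, b


def convert(source, A, b):
    """Apply the fitted affine map to source feature vectors."""
    return np.asarray(source, dtype=float) @ A.T + b
```

With all weights equal, this reduces to the conventional least-squares estimate that the abstract contrasts with the perceptually weighted one. Note that per-frame weights only change which frames dominate the fit; a weighting applied across spectral dimensions would require a different formulation.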

Similar resources

A Study on Bag of Gaussian Model with Application to Voice Conversion

GMM-based mapping techniques have proved to be an efficient way to find a nonlinear regression function between two spaces, and have found success in voice conversion. In these methods, a linear transformation is estimated for each Gaussian component, and the final conversion function is a weighted summation of all linear transformations. These linear transformations fit well for the samples near to...

A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis

This paper presents a comparison of methods for transforming voice quality in neutral synthetic speech to match cheerful, aggressive, and depressed expressive styles. Neutral speech is generated using the unit selection system in the MARY TTS platform and a large neutral database in German. The output is modified using voice conversion techniques to match the target expressive styles, the focus...

The linear transformation of LF glottal waveforms for voice conversion

Most Voice Conversion (VC) systems exploit source-filter decomposition based on linear prediction (LP) to transform spectral envelopes, incurring as a result various issues related to the oversimplification of the LP voice source model. Whilst residual prediction methods can mitigate this problem, they cannot be used to modify voice source quality. In this paper, a system which employs linear t...

Spectral voice conversion based on unsupervised clustering of acoustic space

Voice conversion systems aim at modifying a source speaker’s speech so that it is perceived as if a target speaker had spoken it. Applying voice conversion techniques to a concatenative text-to-speech synthesizer allows for the personification of such systems, so that additional voices from a single source-speaker database can be produced quickly and automatically. This paper presents a new alg...

Analysis of speaker clustering strategies for HMM-based speech synthesis

This paper describes a method for speaker clustering, with the application of building average voice models for speaker-adaptive HMM-based speech synthesis that are a good basis for adapting to specific target speakers. Our main hypothesis is that using perceptually similar speakers to build the average voice model will be better than using unselected speakers, even if the amount of data available...

Publication date: 2003