A novel technique for voice conversion based on style and content decomposition with bilinear models

Authors

  • Victor Popa
  • Jani Nurminen
  • Moncef Gabbouj
Abstract

This paper presents a novel technique for voice conversion by solving a two-factor task using bilinear models. The spectral content of the speech, represented as line spectral frequencies, is separated into so-called style and content parameterizations using a framework proposed in [1]. This formulation of the voice conversion problem in terms of style and content offers a flexible representation of factor interactions and facilitates the use of efficient training algorithms based on singular value decomposition and expectation maximization. Promising results in a comparison with the traditional Gaussian mixture model based method indicate increased robustness with small training sets.
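The SVD-based training mentioned in the abstract can be illustrated with the asymmetric bilinear model of Tenenbaum and Freeman (the style and content framework listed further down this page). The sketch below is not the authors' implementation: the function names, the dimensions, and the random stand-in data (used in place of real line spectral frequency vectors) are all assumptions made for illustration. It stacks the feature vectors of several training speakers (styles) over a shared set of content classes, factorizes the stacked matrix with a truncated SVD, and then estimates the style matrix of a held-out speaker by least squares from a few observed content classes.

```python
import numpy as np

def fit_asymmetric_bilinear(Y, J):
    """Fit Y ~= A @ B with a rank-J truncated SVD.

    Y: (S*K, C) matrix stacking S styles, each a K-dimensional
       vector per content class (C columns).
    Returns A (S*K, J), the stacked style matrices, and
            B (J, C), the content vectors.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :J] * s[:J]   # scale the leading J left vectors by their singular values
    B = Vt[:J, :]          # leading J right singular vectors as content vectors
    return A, B

def adapt_style(Y_new, B_known):
    """Least-squares style matrix for a new speaker.

    Y_new:   (K, C_obs) vectors observed for some content classes.
    B_known: (J, C_obs) content vectors of those same classes.
    """
    return Y_new @ np.linalg.pinv(B_known)

# Hypothetical sizes: training speakers, feature order, content classes, model dimension.
S, K, C, J = 3, 10, 40, 4
rng = np.random.default_rng(0)

# Synthetic data that is exactly bilinear; the last style plays the target speaker.
A_true = rng.normal(size=(S + 1, K, J))
B_true = rng.normal(size=(J, C))
Y_all = np.einsum("skj,jc->skc", A_true, B_true)   # shape (S+1, K, C)

# Train on the first S speakers.
Y_train = Y_all[:S].reshape(S * K, C)
A, B = fit_asymmetric_bilinear(Y_train, J)

# The target speaker is observed on only the first n_obs content classes.
n_obs = 8
A_target = adapt_style(Y_all[S][:, :n_obs], B[:, :n_obs])

# Predict the target speaker's vectors for every content class.
Y_pred = A_target @ B
max_err = np.max(np.abs(Y_pred - Y_all[S]))
```

Because the synthetic data is exactly rank J, the prediction matches the held-out speaker up to numerical precision. Real speech features would not be exactly bilinear, so the fit would only be approximate and the choice of J would matter.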

Similar resources

A Study of Bilinear Models in Voice Conversion

This paper presents a voice conversion technique based on bilinear models and introduces the concept of contextual modeling. The bilinear approach reformulates the spectral envelope representation from line spectral frequencies feature to a two-factor parameterization corresponding to speaker identity and phonetic information, the so-called style and content factors. This decomposition offers a...

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for optimizing the performance of GMM-based voice conversion systems, with the GMM method taken as the baseline to be improved. In current methods, because a single conversion function is used to convert all speech units and statistical averaging then smooths the spectrum, we will observe quality ...

Separating Style and Content with Bilinear Models

Perceptual systems routinely separate "content" from "style," classifying familiar words spoken in an unfamiliar accent, identifying a font or handwriting style across letters, or recognizing a familiar face or object seen under unfamiliar viewing conditions. Yet a general and tractable computational model of this ability to untangle the underlying factors of perceptual observations remains elu...

UW CSE Technical Report 03-06-01 Probabilistic Bilinear Models for Appearance-Based Vision

We present a probabilistic approach to learning object representations based on the “content and style” bilinear generative model of Tenenbaum and Freeman. In contrast to their earlier SVD-based approach, our approach models images using particle filters. We maintain separate particle filters to represent the content and style spaces, allowing us to define arbitrary weighting functions over the...

Probabilistic Bilinear Models for Appearance-Based Vision

We present a probabilistic approach to learning object representations based on the “content and style” bilinear generative model of Tenenbaum and Freeman. In contrast to their earlier SVD-based approach, our approach models images using particle filters. We maintain separate particle filters to represent the content and style spaces, allowing us to define arbitrary weighting functions over the...

Publication date: 2009