voice conversion

First Steps Towards New Czech Voice Conversion System

2006

Zdeněk Hanzlíček, Jindřich Matoušek

In this paper we deal with initial experiments on creating a new Czech voice conversion system. Voice conversion (VC) is a process that modifies the speech signal produced by one (source) speaker so that it sounds like another (target) speaker. Using a VC technique, a new voice for a speech synthesizer can be prepared without recording a large amount of new speech data. The transformation is de...


Nonaudible murmur enhancement based on statistical voice conversion and noise suppression with external noise monitoring

2016

Yusuke Tajiri, Tomoki Toda

This paper presents a method for making nonaudible murmur (NAM) enhancement based on statistical voice conversion (VC) robust against external noise. NAM, which is an extremely soft whispered voice, is a promising medium for silent speech communication thanks to its faint volume. Although such a soft voice can still be detected with a special body-conductive microphone, its quality significantl...


Adaptive Training for Voice Conversion Based on Eigenvoices

Journal: IEICE Transactions, 2010

Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

In this paper, we describe a novel model training method for one-to-many eigenvoice conversion (EVC). One-to-many EVC is a technique for converting a specific source speaker’s voice into an arbitrary target speaker’s voice. An eigenvoice Gaussian mixture model (EVGMM) is trained in advance using multiple parallel data sets consisting of utterance-pairs of the source speaker and many pre-stored ...


A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation

Journal: IEICE Transactions, 2014

Kou Tanaka, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura

This paper presents an electrolaryngeal (EL) speech enhancement method capable of significantly improving naturalness of EL speech while causing no degradation in its intelligibility. An electrolarynx is an external device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it s...


Voice conversion using precise speech alignment based on spectral property and eigen-codeword distribution

2010

Yi-Chin Huang, Chung-Hsien Wu, Chung-Han Lee, Yu-Ting Chao

While voice conversion methods have been widely applied to convert the speech signals uttered by a source speaker to those of a target speaker, frame-based voice conversion generally suffers from incorrect alignment when only spectral distance is used, and therefore generates improper conversion results. In a parallel phone sequence, the alignment using minimum spectral distance between frame-based feature ve...
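Frame alignment by minimum spectral distance, as mentioned in this abstract, is commonly carried out with dynamic time warping (DTW). The following is a minimal sketch of plain DTW, not the authors' proposed method; the function name `dtw_align`, the Euclidean frame distance, and the toy one-dimensional "spectra" are assumptions made for illustration.

```python
import math


def dtw_align(source, target):
    """Align two sequences of feature vectors with plain DTW.

    Returns (total_cost, path), where path is a list of (i, j) pairs
    mapping source frame i to target frame j.
    """
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(source[i - 1], target[j - 1])  # Euclidean distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Toy example: the target repeats one frame, so one source frame maps to two target frames.
source = [[0.0], [1.0], [2.0]]
target = [[0.0], [1.0], [1.0], [2.0]]
total_cost, path = dtw_align(source, target)
```

Because the accumulated cost depends only on spectral distance, acoustically similar but phonetically different frames can be paired, which is the kind of misalignment the abstract describes.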


Novel method for data clustering and mode selection with application in voice conversion

2006

Jani Nurminen, Jilei Tian, Victor Popa

Since the statistical properties of speech signals are variable and depend heavily on the content, it is hard to design speech processing techniques that would perform well on all inputs. For example, in voice conversion, where the aim is to transform the speech uttered by a source speaker to sound as if it was spoken by a target speaker, different types of interspeaker relationships can be fou...


Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization

2016

Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki

This paper proposes a discriminative learning method for Nonnegative Matrix Factorization (NMF)-based Voice Conversion (VC). NMF-based VC has been researched because of the natural-sounding voice it produces compared with conventional Gaussian Mixture Model (GMM)-based VC. In conventional NMF-based VC, parallel exemplars are used as the dictionary; therefore, dictionary learning is not adopted....
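The conventional exemplar-based scheme this abstract refers to can be sketched as follows: a source frame is approximated as a non-negative combination of source exemplars, and the same activation weights are then applied to the paired target exemplars. This is only an illustration of that conventional baseline under stated assumptions, not the discriminative method the paper proposes; the function names, the Euclidean multiplicative update, and the tiny dictionaries are hypothetical.

```python
def estimate_activations(dictionary, frame, n_iter=200, eps=1e-12):
    """Estimate non-negative weights h so that sum_k h[k] * dictionary[k] ~ frame.

    dictionary: list of K exemplars, each a list of D non-negative values.
    Uses the standard multiplicative update for the Euclidean NMF objective
    with the dictionary held fixed.
    """
    k, dim = len(dictionary), len(frame)
    h = [1.0] * k
    for _ in range(n_iter):
        # Reconstruction with the current weights (shared by all updates this round).
        recon = [sum(h[a] * dictionary[a][d] for a in range(k)) for d in range(dim)]
        for a in range(k):
            num = sum(dictionary[a][d] * frame[d] for d in range(dim))
            den = sum(dictionary[a][d] * recon[d] for d in range(dim)) + eps
            h[a] *= num / den
    return h


def convert_frame(frame, source_dict, target_dict):
    """Map a source frame to the target space by reusing its activations."""
    h = estimate_activations(source_dict, frame)
    dim = len(target_dict[0])
    return [sum(h[a] * target_dict[a][d] for a in range(len(h))) for d in range(dim)]


# Toy parallel dictionaries: exemplar k of the source is paired with exemplar k of the target.
source_dict = [[1.0, 0.0], [0.0, 1.0]]
target_dict = [[2.0, 0.0], [0.0, 3.0]]
converted = convert_frame([0.5, 2.0], source_dict, target_dict)
```

The sketch relies on the two dictionaries being frame-aligned (parallel), which is why no dictionary learning step appears in it.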


Exemplar-based unit selection for voice conversion utilizing temporal information

2013

Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Chng Eng Siong, Haizhou Li

Although temporal information of speech has been shown to play an important role in perception, most voice conversion approaches assume that speech frames are independent of each other, thereby ignoring the temporal information. In this study, we improve the conventional unit selection approach by using exemplars which span multiple frames as base units, and also take temporal information con...
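One simple way to build exemplars that span multiple frames is to concatenate each run of consecutive frames into a single vector. The sketch below illustrates only that idea; the function name `stack_frames` and the `span` parameter are hypothetical, and the abstract does not state that the authors construct their exemplars this way.

```python
def stack_frames(frames, span):
    """Concatenate each run of `span` consecutive frames into one exemplar.

    frames: list of per-frame feature vectors (lists of numbers).
    Returns len(frames) - span + 1 exemplars, or an empty list if the
    sequence is shorter than `span`.
    """
    if span < 1:
        raise ValueError("span must be at least 1")
    exemplars = []
    for start in range(len(frames) - span + 1):
        window = frames[start:start + span]
        # Flatten the window so the exemplar keeps the frames in temporal order.
        exemplars.append([value for frame in window for value in frame])
    return exemplars


frames = [[1, 2], [3, 4], [5, 6]]
exemplars = stack_frames(frames, span=2)
```

Each exemplar then carries local temporal context, so a distance computed between exemplars compares short trajectories rather than isolated frames.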


Frame alignment method for cross-lingual voice conversion

2007

Daniel Erro, Asunción Moreno

Most existing voice conversion methods calculate the optimal transformation function from a given set of paired acoustic vectors of the source and target speakers. The alignment of the phonetically equivalent source and target frames is problematic when the available training corpus is not parallel, although this is the most realistic situation. The alignment task is even more difficult ...


Quality Improvement of Voice Conversion Systems Based on Trellis Structured Vector Quantization

2011

Mahdi Eslami, Hamid Sheikhzadeh, Abolghasem Sayadiyan

Common voice conversion systems employ a spectral/time-domain mapping to convert speech from one speaker to another. The converted speech does not sound natural because the spectral/time-domain patterns of the two speakers’ speech do not match completely. In this paper we propose a method that uses inter-frame (dynamic) characteristics in addition to intra-frame characterist...
