High-quality analysis/synthesis method based on temporal decomposition for speech modification
Abstract
The challenge in speech modification is to modify speech flexibly without degrading its quality. Conventional methods cannot flexibly control speech signals in both the time and frequency domains, which degrades the quality of the modified speech. This paper proposes a high-quality analysis/synthesis method for speech modification. To control temporal evolution, we use a speech analysis technique called temporal decomposition (TD), which decomposes a speech signal into event targets and event functions. The event functions describe the temporal evolution of the spectral and excitation parameters, and the event targets represent the “ideal” spectral parameters. The event functions evaluated for the spectral parameters are also used to model the temporal evolution of the excitation parameters. To flexibly control speech signals in both the time and frequency domains, we propose new methods to model the event functions and the event targets. Experimental results show that the proposed analysis/synthesis method produces high-quality synthesized speech and allows speech signals to be modified flexibly.
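As a rough illustration of the TD model described in the abstract, the sketch below rebuilds a parameter track as a weighted sum of event targets, y(n) = sum_k a_k * phi_k(n), and reuses the same event functions for an excitation parameter. The triangular event functions, the function names, and all numeric values are assumptions made for illustration only; the abstract does not specify how the paper models the event functions or targets.

```python
def triangular_event_functions(centers, n_frames):
    """Return phi[k][n]: piecewise-linear event functions peaking at centers[k].

    Hypothetical shape chosen for illustration (not the paper's model).
    Adjacent functions overlap and all functions sum to 1 at every frame.
    """
    phi = [[0.0] * n_frames for _ in centers]
    for n in range(n_frames):
        if n <= centers[0]:
            phi[0][n] = 1.0
        elif n >= centers[-1]:
            phi[-1][n] = 1.0
        else:
            # Find the pair of neighbouring events that brackets frame n.
            for k in range(len(centers) - 1):
                left, right = centers[k], centers[k + 1]
                if left <= n < right:
                    w = (n - left) / (right - left)
                    phi[k][n] = 1.0 - w
                    phi[k + 1][n] = w
                    break
    return phi


def reconstruct(targets, phi):
    """Compute y[n][p] = sum over k of targets[k][p] * phi[k][n]."""
    n_frames = len(phi[0])
    n_params = len(targets[0])
    return [
        [sum(targets[k][p] * phi[k][n] for k in range(len(targets)))
         for p in range(n_params)]
        for n in range(n_frames)
    ]


# Made-up example: 3 events, 2 spectral parameters, 1 excitation parameter (F0).
centers = [0, 10, 25]
spectral_targets = [[0.3, 1.2], [0.5, 1.0], [0.4, 1.5]]
f0_targets = [[120.0], [140.0], [110.0]]

phi = triangular_event_functions(centers, 26)
spectral_track = reconstruct(spectral_targets, phi)
f0_track = reconstruct(f0_targets, phi)  # same event functions, excitation targets

# Time-domain modification: stretch the event timing by 2, keep the targets.
slow_centers = [2 * c for c in centers]
slow_phi = triangular_event_functions(slow_centers, 51)
slow_track = reconstruct(spectral_targets, slow_phi)
```

In this toy setup, changing the event functions alters timing while leaving the targets untouched, and editing the targets alters the spectral or excitation values without touching the timing. That separation is the kind of independent time and frequency control the abstract refers to.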
Similar resources
Coding Speech at Very Low R and Temporal Deco
This paper presents a new method for speech coding at rates around 1.2 kbps based on STRAIGHT, a high quality speech analysis-synthesis method. For encoding spectral information, Modified Restricted Temporal Decomposition (MRTD) based vector quantization is used, where MRTD is a method of temporal decomposition for line spectral frequency parameters. Meanwhile, pitch and gain parameters are cod...
Perceptual Evaluation of Quality Deterioration Owing to Prosody Modification
Our research goal is to construct a Japanese TTS (Text-to-Speech) system that can output various kinds of prosody. Since such synthetic speech is useful in practice, many TTS systems have implemented global prosodic control processing. But fundamentally they are designed to output speech with standard pitch and speech rate. We discuss a synthesis method for high quality speech with extrem...
High quality speech synthesis using a small speech dataset
We propose an approach to synthesizing high-quality speech under the conditions of a small dataset. A robust method for solving this problem is vital for voice restoration (recreation of lost fragments of records based on available speech material of a well-known person, e.g. an actor). The proposed TTS system is a hybrid system which includes the advantages of both HMM- and Unit Selection-based ...
Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
A new control paradigm of source signals for high quality speech synthesis is introduced to handle a variety of speech qualities, based on time-frequency analyses using instantaneous frequency and group delay. The proposed signal representation consists of a frequency domain aperiodicity measure and a time domain energy concentration measure to represent source attributes, which supplem...
Wideband Harmonic Model: Alignment and Noise Modeling for High Quality Speech Synthesis
Speech sinusoidal modeling has been successfully applied to a broad range of speech analysis, synthesis and modification tasks. However, developing a high fidelity full band sinusoidal model that preserves its high quality on speech transformation still remains an open research problem. Such a system can be extremely useful for high quality speech synthesis. In this paper we present an enhanced...