DFW-based spectral smoothing for concatenative speech synthesis
Abstract
This paper proposes and evaluates a new spectral smoothing technique whose performance is comparable with LSP interpolation in terms of Euclidean spectral distance measurements, but whose interpolated formant trajectories are more reasonable from a phonetic point of view. The approach first estimates the derivative logarithmic magnitude spectra of both the source and the target frame, each represented by autoregressive filter coefficients. Dynamic programming (DP) then yields the best alignment between these two spectral representations. Smoothed frequency responses are obtained by weighted linear interpolation between corresponding source and target spectral lines, where the correspondence is the alignment found by DP backtracking. Finally, the smoothed spectrum is converted back to autoregressive filter coefficients, with autocorrelation coefficients as an intermediate stage.
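The four steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the absolute-difference local cost, the three-step DP pattern, the interpolation of both bin position and log magnitude, and the FFT size are all assumptions made for illustration. The two second-order filters at the end are hypothetical test frames.

```python
import numpy as np

def log_spectrum(a, nfft=256):
    """Log magnitude of the all-pole filter 1/A(z) on nfft//2 + 1 bins."""
    return -np.log(np.abs(np.fft.rfft(a, nfft)))

def dp_align(x, y):
    """Dynamic-programming alignment of two sequences (assumed cost: |x - y|).
    Returns the backtracked path as an array of (i, j) index pairs."""
    n, m = len(x), len(y)
    cost = np.full((n, m), np.inf)
    cost[0, 0] = abs(x[0] - y[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                cost[i - 1, j] if i > 0 else np.inf,
                cost[i, j - 1] if j > 0 else np.inf,
                cost[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            cost[i, j] = abs(x[i] - y[j]) + best
    # Backtrack from the last cell to the first along the cheapest predecessors.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((cost[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    return np.array(path[::-1])

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation -> AR coefficients (a[0] = 1)."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for k in range(1, order + 1):
        refl = -np.dot(a[:k], r[k:0:-1]) / err
        a[1:k + 1] = a[1:k + 1] + refl * a[k - 1::-1]
        err *= 1.0 - refl ** 2
    return a

def dfw_smooth(a_src, a_tgt, alpha, nfft=256):
    """Interpolated AR frame; alpha = 0 gives the source, alpha = 1 the target."""
    S, T = log_spectrum(a_src, nfft), log_spectrum(a_tgt, nfft)
    # Step 1-2: align the derivative log spectra by DP.
    path = dp_align(np.gradient(S), np.gradient(T))
    i, j = path[:, 0], path[:, 1]
    # Step 3: weighted linear interpolation between aligned spectral lines
    # (assumption: both the bin position and the log magnitude are interpolated).
    freq = (1 - alpha) * i + alpha * j
    logmag = (1 - alpha) * S[i] + alpha * T[j]
    freq, first = np.unique(freq, return_index=True)  # drop repeated positions
    warped = np.interp(np.arange(len(S)), freq, logmag[first])
    # Step 4: power spectrum -> autocorrelation -> AR coefficients.
    r = np.fft.irfft(np.exp(2.0 * warped))
    return levinson(r, len(a_src) - 1)

# Hypothetical frames: one resonance each, at 0.8 rad and 1.2 rad.
a_src = np.real(np.poly([0.9 * np.exp(0.8j), 0.9 * np.exp(-0.8j)]))
a_tgt = np.real(np.poly([0.9 * np.exp(1.2j), 0.9 * np.exp(-1.2j)]))
a_mid = dfw_smooth(a_src, a_tgt, 0.5)
```

Because the interpolation moves spectral lines along the frequency axis as well as scaling them, a resonance can shift gradually between frames instead of fading out at one frequency and fading in at another, which is the motivation the abstract gives for preferring this scheme on phonetic grounds.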
Similar resources
Spectral smoothing for concatenative speech synthesis
This paper addresses the topic of performing effective concatenative speech synthesis with a limited database by proposing methods to smooth the transitions between speech segments. The objective is to produce natural-sounding speech via segment concatenation when formants and other spectral features do not align properly. We propose several methods for adjusting the spectra between waveform segm...
Exploiting improved parameter smoothing within a hybrid concatenative/LPC speech synthesizer
We depict the interpolation strategies for the concatenation of inventory demisyllables in our hybrid concatenative/LPC speech synthesizer. Inventory elements for vowels and nasals are cut in the steady state of the phoneme. Concatenating elements in the synthesis stage requires smoothing of spectral content and energy to avoid annoying discontinuities in these parameters, which is of vital imp...
A comparison of spectral smoothing methods for segment concatenation based speech synthesis
There are many scenarios in both speech synthesis and coding in which adjacent time-frames of speech are spectrally discontinuous. This paper addresses the topic of improving concatenative speech synthesis with a limited database by proposing methods to smooth, adjust, or interpolate the spectral transitions between speech segments. The objective is to produce natural-sounding speech via segmen...
Spectral Envelope Transformation Using DFW and Amplitude Scaling for Voice Conversion with Parallel or Nonparallel Corpora
Dynamic Frequency Warping (DFW) offers an appealing alternative to GMM-based voice conversion, which suffers from "over-smoothing" that hinders speech quality. However, to adjust spectral power after DFW, previous work returns to GMM transformation. This paper proposes a more effective DFW with amplitude scaling (DFWA) that functions on the acoustic class level and is independent of GMM-transform...
Estimation of Spectral Mismatch for Joint Cost Evaluation in Marathi TTS
Among the different methods of speech synthesis, concatenative speech synthesis is widely used because of its naturalness and low signal-processing requirements. However, concatenative TTS has problems such as the need for a large database and the resulting spectral mismatch in the output speech. In concatenative TTS, the position of a syllable plays a very important role in segmentation. If proper position syl...