Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis
Authors
Abstract
This paper describes a method for determining the vocal tract spectrum from articulatory movements using a Gaussian mixture model (GMM), with the aim of synthesizing speech from articulatory information. A GMM of the joint probability density of articulatory parameters and acoustic spectral parameters is trained on a parallel acoustic-articulatory speech database. We evaluate the GMM-based mapping with a spectral distortion measure. Experimental results show that the distortion is reduced when power and voicing information are used as input features in addition to the articulatory parameters of the vocal tract. Moreover, to determine the best mapping, we apply maximum likelihood estimation (MLE) to the GMM-based mapping method. Experimental results show that MLE using both static and dynamic features improves mapping accuracy over the conventional GMM-based mapping.
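The abstract does not spell out the mapping formula, so the following is only a minimal sketch of one common form of GMM-based mapping: the conditional expectation of the acoustic feature given the articulatory feature under a joint GMM. It assumes scalar features for readability. The function names, the dictionary layout, and the toy parameters are hypothetical and are not taken from the paper. It does not implement the MLE variant with dynamic features.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_map(x, components):
    """Estimate y from x as E[y | x] under a joint GMM over (x, y).

    Each component is a dict with keys (all hypothetical names):
      weight  - mixture weight
      mu_x    - mean of the input (articulatory) feature
      mu_y    - mean of the output (spectral) feature
      var_x   - variance of the input feature
      cov_yx  - covariance between output and input features
    """
    # Unnormalized posterior of each component given x.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)

    y_hat = 0.0
    for lik, c in zip(likelihoods, components):
        posterior = lik / total
        # Conditional mean of y given x within this component.
        cond_mean = c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"])
        y_hat += posterior * cond_mean
    return y_hat


# Toy parameters, invented for illustration only.
TOY_COMPONENTS = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": -2.0, "var_x": 1.0, "cov_yx": 0.5},
]

estimate = gmm_map(0.0, TOY_COMPONENTS)
```

In a real system the features would be vectors, the variances and covariances would be matrices, and the parameters would be trained on the parallel database rather than written by hand. For inputs far from every component mean the likelihoods can underflow to zero, so a practical version would work in the log domain.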
Similar resources
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
In this paper, we describe a statistical approach to both an articulatory-to-acoustic mapping and an acoustic-to-articulatory inversion mapping without using phonetic information. The joint probability density of an articulatory parameter and an acoustic parameter is modeled using a Gaussian mixture model (GMM) based on a parallel acoustic-articulatory speech database. We apply the GMM-based ma...
An Analysis-by-Synthesis Approach to Vocal Tract Modeling for Robust Speech Recognition (submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering)
In this thesis we present a novel approach to speech recognition that incorporates knowledge of the speech production process. The major contribution is the development of a speech recognition system that is motivated by the physical generative process of speech, rather than the purely statistical approach that has been the basis for virtually all current recognizers. We follow an analysis-by-s...
Articulatory controllable speech modification based on Gaussian mixture models with direct waveform modification using spectrum differential
In our previous work, we have developed a speech modification system capable of manipulating unobserved articulatory movements by sequentially performing speech-to-articulatory inversion mapping and articulatory-to-speech production mapping based on a Gaussian mixture model (GMM)-based statistical feature mapping technique. One of the biggest issues to be addressed in this system is quality deg...
Articulatory controllable speech modification based on statistical feature mapping with Gaussian mixture models
This paper presents a novel speech modification method capable of controlling unobservable articulatory parameters based on a statistical feature mapping technique with Gaussian mixture models (GMMs). In previous work [1], the GMM-based statistical feature mapping was successfully applied to acoustic-to-articulatory inversion mapping and articulatory-to-acoustic production mapping separately. In...
Comparative articulatory modelling of the tongue in speech and feeding
Purpose: Two of the major functions of the human vocal tract are feeding and speaking. As ontogenetically and phylogenetically feeding tasks precede speaking tasks, it has been hypothesised that the skilled movements of the orofacial articulators specific to speech may have evolved from feeding functions. Our objective is to bring evidence to support this hypothesis. Method: Vocal tract articul...