Predicting unseen articulations from multi-speaker articulatory models
Abstract
To study inter-speaker variability, this work assesses the generalization capabilities of data-based multi-speaker articulatory models. We use several three-mode factor analysis techniques to model the variations of midsagittal vocal tract contours obtained from MRI images of three French speakers articulating 73 vowels and consonants. Articulations of a given speaker for phonemes absent from the training set are then predicted by inverting the models from measurements of these phonemes as articulated by the other subjects. On average, the prediction RMSE was 5.25 mm for tongue contours and 3.3 mm for 2D midsagittal vocal tract distances. This study also establishes a methodology for determining the optimal number of factors for such models.
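The prediction-by-inversion idea can be illustrated with a minimal sketch. This is not the three-mode factor analysis used in the paper: as a simplifying assumption it substitutes an ordinary two-mode PCA in which the contour coordinates of all speakers are concatenated per phoneme. The data are synthetic, and every name (`fit_linear_model`, `predict_held_out`, `demo`) and dimension is hypothetical.

```python
import numpy as np


def fit_linear_model(train, n_factors):
    """Fit a PCA-style linear model.

    train: array of shape (n_phonemes, n_speakers * n_coords), one row per
    training phoneme, with all speakers' coordinates concatenated.
    Returns the mean row and the first n_factors principal directions.
    """
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_factors]


def predict_held_out(mean, loadings, observed, obs_idx, miss_idx):
    """Invert the model: estimate factor scores by least squares from the
    observed speakers' coordinates, then reconstruct the missing speaker."""
    scores, *_ = np.linalg.lstsq(
        loadings[:, obs_idx].T, observed - mean[obs_idx], rcond=None
    )
    return mean[miss_idx] + scores @ loadings[:, miss_idx]


def rmse(a, b):
    """Root-mean-square error between two coordinate vectors."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


def demo(noise=0.0, seed=0):
    """Synthetic check (assumed sizes): 3 speakers, 20 phonemes,
    10 coordinates per speaker, 3 underlying factors."""
    rng = np.random.default_rng(seed)
    n_speakers, n_phonemes, n_coords, n_factors = 3, 20, 10, 3
    dim = n_speakers * n_coords

    # Low-rank synthetic "articulations" plus optional noise.
    data = (
        rng.normal(size=dim)
        + rng.normal(size=(n_phonemes, n_factors))
        @ rng.normal(size=(n_factors, dim))
        + noise * rng.normal(size=(n_phonemes, dim))
    )

    # Hold out the last phoneme; speaker 0's block is what we predict.
    train, held_out = data[:-1], data[-1]
    miss_idx = np.arange(0, n_coords)
    obs_idx = np.arange(n_coords, dim)

    mean, loadings = fit_linear_model(train, n_factors)
    predicted = predict_held_out(
        mean, loadings, held_out[obs_idx], obs_idx, miss_idx
    )
    return rmse(predicted, held_out[miss_idx])


if __name__ == "__main__":
    print("held-out RMSE:", demo(noise=0.1))
```

In this toy setting, repeating `demo` over a range of `n_factors` values and comparing the held-out RMSE is one plausible way to choose the number of factors. Whether this matches the methodology established in the paper is not stated in the abstract.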
Similar resources
A Speaker Adaptive DNN Training Approach for Speaker-Independent Acoustic Inversion
We address the speaker-independent acoustic inversion (AI) problem, also referred to as acoustic-to-articulatory mapping. The scarce availability of multi-speaker articulatory data makes it difficult to learn a mapping which generalizes from a limited number of training speakers and reliably reconstructs the articulatory movements of unseen speakers. In this paper, we propose a Multi-task Learn...
Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker
In this paper, we propose a method to predict the articulatory movements of phonemes that are difficult for a speaker to pronounce correctly because those phonemes are not seen in the native language of that speaker. When one wants to predict the articulatory movements of those unseen phonemes, since he/she has difficulty to generate those sounds, the conventional acoustic-to-articulatory mappi...
From Acoustics to Articulation
The focus of this thesis is the relationship between the articulation of speech and the acoustics of produced speech. There are several problems that are encountered in understanding this relationship, given the non-linearity, variance and non-uniqueness in the mapping, as well as the differences that exist in the size and shape of the articulators, and consequently the acoustics, for different...
Generating Gestural Scores from Acoustics Through a Sparse Anchor-Based Representation of Speech
We present a procedure for generating gestural scores from speech acoustics. The procedure is based on our recent SABR (sparse, anchor-based representation) algorithm, which models the speech signal as a linear combination of acoustic anchors. We present modifications to SABR that encourage temporal smoothness by restricting the number of anchors that can be active over an analysis window. We p...
Comparative articulatory modelling of the tongue in speech and feeding
Purpose: Two of the major functions of the human vocal tract are feeding and speaking. As ontogenetically and phylogenetically feeding tasks precede speaking tasks, it has been hypothesised that the skilled movements of the orofacial articulators specific to speech may have evolved from feeding functions. Our objective is to bring evidence to support this hypothesis. Method: Vocal tract articul...