A Neural Model of Speech Production
Abstract
This paper describes the most recent version of the DIVA model, a neural network model of the brain computations underlying the acquisition and production of speech sounds. The model, which is implemented as a set of equations representing neural activity and synaptic strengths, is designed to account for the results of functional magnetic resonance imaging (fMRI) and electromagnetic midsagittal articulometry (EMMA) experiments on speech production. Computer simulations of the model illustrate its ability to account for speaker-specific articulator movements in different phonetic contexts, as well as fMRI activations seen during normal and perturbed speech. The model is also used to generate predictions that guide new fMRI and EMMA experiments aimed at a better understanding of the neural bases of speech. The results of these experiments are in turn used to refine the model further. Finally, the model can be used to investigate the effects of various types of neurological damage on speaking skills.

INTRODUCTION: OVERVIEW OF THE DIVA MODEL

Figure 1 provides an overview of the DIVA model (e.g., Guenther, 1994; Guenther et al., 1995; 1998; Perkell et al., 2000; Callan et al., 2000). The model consists of a neural network controller whose cells correspond to the boxes in the figure and whose synaptic weights correspond to the arrows. The neural network uses a babbling stage to learn the neural mappings (arrows) necessary for controlling an articulatory synthesizer (e.g., Maeda, 1990). The output of the model (labelled "To Muscles" in Figure 1) specifies the positions of the 7 articulators that determine the vocal tract shape in the articulatory synthesizer.

[Figure 1: overview of the DIVA model. Box labels: Speech Sound Map (Premotor Cortex); Articulator Velocity and Position Maps (Motor Cortex); Auditory Error Map (Auditory Cortex); Somatosensory Error Map (Somatosensory Cortex); Auditory Goal Region; Somatosensory Goal Region; Somatosensory State; Auditory State.]
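The babbling idea in the introduction can be illustrated with a toy sketch. This is not the DIVA model's equations: it assumes a made-up linear "synthesizer" (the matrix `SYNTH`) in place of a nonlinear articulatory synthesizer such as Maeda's, three arbitrary auditory dimensions, and a plain delta-rule update. All names (`SYNTH`, `babble`, `reach`, `auditory_error`) are hypothetical. It only shows how random articulator movements paired with their auditory consequences could train a mapping from auditory errors to corrective movements of 7 articulators.

```python
import random

# Hypothetical linear "synthesizer": 7 articulator positions -> 3 auditory
# dimensions. The values are invented for illustration; a real articulatory
# synthesizer is nonlinear.
SYNTH = [
    [1.0, 0.5, 0.0, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.4, 1.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.2, 1.0, 0.0, 0.0, 0.6],
]
N_ART, N_AUD = 7, 3


def synthesize(x):
    """Auditory consequence of an articulator configuration (toy, linear)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in SYNTH]


def babble(steps=5000, lr=0.1, seed=0):
    """Learn weights mapping an auditory change to an articulator change.

    Each step makes a random articulator movement, observes the resulting
    auditory change, and applies a delta-rule update so that the weights
    predict the movement from its auditory effect.
    """
    rng = random.Random(seed)
    w = [[0.0] * N_AUD for _ in range(N_ART)]
    for _ in range(steps):
        dx = [rng.uniform(-1.0, 1.0) for _ in range(N_ART)]
        dy = synthesize(dx)  # linear toy, so this is the auditory change
        for i in range(N_ART):
            pred = sum(w[i][j] * dy[j] for j in range(N_AUD))
            err = dx[i] - pred
            for j in range(N_AUD):
                w[i][j] += lr * err * dy[j]
    return w


def reach(w, target, steps=50, gain=0.5):
    """Move the articulators toward an auditory target using the learned map."""
    x = [0.0] * N_ART
    for _ in range(steps):
        y = synthesize(x)
        e = [t - yi for t, yi in zip(target, y)]
        for i in range(N_ART):
            x[i] += gain * sum(w[i][j] * e[j] for j in range(N_AUD))
    return x


def auditory_error(x, target):
    """Largest absolute difference between produced and target sound."""
    return max(abs(t - yi) for t, yi in zip(target, synthesize(x)))


weights = babble()
final_x = reach(weights, [0.8, -0.3, 0.5])
```

In this linear toy, the learned weights approach a pseudoinverse of `SYNTH`, so repeatedly feeding the auditory error through them drives the produced sound toward the target. A nonlinear synthesizer would need a mapping that also depends on the current articulator configuration.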
Related resources
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously improving, there is a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based Variability Compensation Methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
Prediction of Egg Production Using Artificial Neural Network
Artificial neural networks (ANN) have been shown to be a powerful tool for system modeling in a wide range of applications. The focus of this study is on neural network applications to data analysis in egg production. An ANN model with two hidden layers, trained with a back propagation algorithm, successfully learned the relationship between the input (age of hen) and output (egg production) variabl...
Multidirectional mappings and the concept of a mental syllabary in a neural model of speech production
As a result of modeling cortical processes of self-organization occurring during speech acquisition, a comprehensive neural model of speech production has been developed using self-organizing neural networks and feedforward neural networks. This model is capable of generating acoustic speech signals and sensory feedback signals by using a high-quality 3-dimensional articulatory-acoustic spe...
[Modeling developmental aspects of sensorimotor control of speech production].
BACKGROUND: Detailed knowledge of the neurophysiology of speech acquisition is important for understanding the developmental aspects of speech perception and production and for understanding developmental disorders of speech perception and production. METHOD: A computer-implemented neural model of sensorimotor control of speech production was developed. The model is capable of demonstrating the...