A Neural Model of Speech Production

Authors

Frank H. Guenther

Satrajit S. Ghosh

Alfonso Nieto-Castanon

Abstract

This paper describes the most recent version of the DIVA model, a neural network model of the brain computations underlying the acquisition and production of speech sounds. The model, which is implemented as a set of equations representing neural activity and synaptic strengths, is designed to account for the results of functional magnetic resonance imaging (fMRI) and electromagnetic midsagittal articulometry (EMMA) experiments concerning the production of speech. Computer simulations of the model have been performed to illustrate its ability to account for speaker-specific articulator movements in different phonetic contexts, as well as fMRI activations seen during normal and perturbed speech. The model is also used to generate predictions that guide new fMRI and EMMA experiments aimed at achieving a better understanding of the neural bases of speech. The results of these experiments are in turn used to further refine the model. Finally, the model can be used to investigate the effects of various types of neurological damage on speaking skills.

INTRODUCTION: OVERVIEW OF THE DIVA MODEL

Figure 1 provides an overview of the DIVA model (e.g., Guenther, 1994; Guenther et al., 1995; 1998; Perkell et al., 2000; Callan et al., 2000). The model consists of a neural network controller whose cells correspond to boxes and synaptic weights correspond to arrows in the figure. The neural network utilizes a babbling stage to learn the neural mappings (arrows) necessary for controlling an articulatory synthesizer (e.g., Maeda, 1990). The output of the model (labelled “To Muscles” in Figure 1) specifies the positions of the 7 articulators that determine the vocal tract shape in the articulatory synthesizer.

[Figure 1: Overview of the DIVA model. Component labels: Speech Sound Map (Premotor Cortex); Articulator Velocity and Position Maps (Motor Cortex); Auditory Error Map (Auditory Cortex); Somatosensory Error Map (Somatosensory Cortex); Auditory Goal Region; Somatosensory Goal Region; Somatosensory State; Auditory State.]
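The babbling idea described above can be illustrated with a small toy sketch. It is not the DIVA model's actual equations, which are not given in this excerpt. It assumes a hypothetical linear "synthesizer" that maps 7 articulator positions to 3 formant-like auditory values, and it uses a simple delta rule (an assumption chosen for illustration) to learn that mapping from random articulator movements. The names `make_toy_synthesizer`, `babble`, and `N_AUDITORY` are invented for this example.

```python
import random

N_ARTICULATORS = 7  # from the text: the model output positions 7 articulators
N_AUDITORY = 3      # assumption: three formant-like auditory dimensions


def make_toy_synthesizer(rng):
    """Build a hypothetical linear stand-in for an articulatory synthesizer."""
    true_w = [[rng.uniform(-1, 1) for _ in range(N_ARTICULATORS)]
              for _ in range(N_AUDITORY)]

    def synthesize(positions):
        # Each auditory value is a weighted sum of articulator positions.
        return [sum(w * p for w, p in zip(row, positions)) for row in true_w]

    return synthesize, true_w


def babble(synthesize, rng, steps=3000, lr=0.1):
    """Learn the articulator-to-auditory mapping from random movements."""
    weights = [[0.0] * N_ARTICULATORS for _ in range(N_AUDITORY)]
    for _ in range(steps):
        # Babbling: pick a random articulator configuration and listen.
        positions = [rng.uniform(-1, 1) for _ in range(N_ARTICULATORS)]
        heard = synthesize(positions)
        for i, row in enumerate(weights):
            predicted = sum(w * p for w, p in zip(row, positions))
            error = heard[i] - predicted
            # Delta rule: nudge each weight in proportion to the error.
            for j in range(N_ARTICULATORS):
                row[j] += lr * error * positions[j]
    return weights


rng = random.Random(0)
synthesize, true_w = make_toy_synthesizer(rng)
learned_w = babble(synthesize, rng)
max_gap = max(abs(a - b)
              for true_row, learned_row in zip(true_w, learned_w)
              for a, b in zip(true_row, learned_row))
print(f"largest weight error after babbling: {max_gap:.2e}")
```

In this sketch the learned weights play the role of the arrows in Figure 1, and the random configurations play the role of babbling. A real articulatory synthesizer is nonlinear, so a linear map would at best be a local approximation.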

Similar resources

Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

Prediction of Egg Production Using Artificial Neural Network

Artificial neural networks (ANN) have been shown to be a powerful tool for system modeling in a wide range of applications. The focus of this study is on neural network applications to data analysis in egg production. An ANN model with two hidden layers, trained with a back propagation algorithm, successfully learned the relationship between the input (age of hen) and output (egg production) variabl...

Multidirectional mappings and the concept of a mental syllabary in a neural model of speech production

As a result of modeling cortical processes of self-organization occurring during speech acquisition, a comprehensive neural model of speech production has been developed by using self-organizing neural networks and feedforward neural networks. This model is capable of generating acoustic speech signals and sensory feedback signals by using a high quality 3-dimensional articulatory-acoustic spe...

[Modeling developmental aspects of sensorimotor control of speech production].

BACKGROUND: Detailed knowledge of the neurophysiology of speech acquisition is important for understanding the developmental aspects of speech perception and production and for understanding developmental disorders of speech perception and production. METHOD: A computer-implemented neural model of sensorimotor control of speech production was developed. The model is capable of demonstrating the...

Publication date: 2003
