Search results for: speech learning model

Number of results: 2641683

1999
Li Deng, Jeff Z. Ma

A statistical coarticulatory model is presented for spontaneous speech recognition, where knowledge of the dynamic, target-directed behavior of the vocal tract resonance responsible for the production of highly coarticulated speech is incorporated into recognizer design, training, and likelihood computation. The principal advantage of the new speech model over the conventional HMM is the...
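
The core idea, a hidden resonance state pulled toward a phone-dependent target, can be sketched with a first-order dynamic model. The sketch below is illustrative only; the approach rate, noise level, and target values are assumptions, not values from the paper.

    import numpy as np

    def simulate_vtr_track(targets, durations, rate=0.4, sigma=0.02, z0=0.5):
        """Simulate a target-directed vocal tract resonance (VTR) trajectory.

        Each phone pulls the hidden resonance state toward its target value
        with a first-order (exponential) approach, which produces the smooth,
        coarticulated transitions the abstract refers to.
        """
        z = z0
        track = []
        for target, dur in zip(targets, durations):
            for _ in range(dur):
                z = z + rate * (target - z) + np.random.normal(0.0, sigma)
                track.append(z)
        return np.array(track)

    # Example: three phone segments with different resonance targets (arbitrary units)
    trajectory = simulate_vtr_track(targets=[0.3, 0.8, 0.5], durations=[10, 15, 12])
    print(trajectory.shape)  # (37,)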

2017
Jen-Tzung Chien, Kuan-Ting Kuo

We present a new stochastic learning machine for speech separation based on the variational recurrent neural network (VRNN). This VRNN is constructed from the perspectives of the generative stochastic network and the variational auto-encoder. The idea is to faithfully characterize the randomness of the hidden state of a recurrent neural network through variational learning. The neural parameters under this...
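
As a rough illustration of the VRNN idea (a Gaussian prior and approximate posterior over a per-frame latent variable feeding a recurrent state), the PyTorch sketch below shows one cell update with reparameterized sampling and the per-frame KL term. Layer sizes and the single-layer parameterizations are assumptions, not the paper's architecture.

    import torch
    import torch.nn as nn

    class VRNNCell(nn.Module):
        """Minimal variational RNN cell sketch (not the paper's exact model).

        At each frame, a prior p(z_t | h_{t-1}) and an approximate posterior
        q(z_t | x_t, h_{t-1}) are diagonal Gaussians; z_t is sampled by
        reparameterization and fed, with x_t, into a GRU update of the state.
        """
        def __init__(self, x_dim, z_dim, h_dim):
            super().__init__()
            self.prior = nn.Linear(h_dim, 2 * z_dim)            # -> (mu, logvar)
            self.posterior = nn.Linear(x_dim + h_dim, 2 * z_dim)
            self.rnn = nn.GRUCell(x_dim + z_dim, h_dim)

        def forward(self, x_t, h_prev):
            mu_p, logvar_p = self.prior(h_prev).chunk(2, dim=-1)
            mu_q, logvar_q = self.posterior(torch.cat([x_t, h_prev], -1)).chunk(2, dim=-1)
            z_t = mu_q + torch.randn_like(mu_q) * (0.5 * logvar_q).exp()  # reparameterize
            h_t = self.rnn(torch.cat([x_t, z_t], -1), h_prev)
            # KL(q || p) between the two diagonal Gaussians, summed over z dims
            kl = 0.5 * (logvar_p - logvar_q
                        + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                        - 1.0).sum(-1)
            return z_t, h_t, kl

    cell = VRNNCell(x_dim=80, z_dim=16, h_dim=128)
    x = torch.randn(4, 80)             # a batch of 4 spectral frames
    h = torch.zeros(4, 128)
    z, h, kl = cell(x, h)
    print(z.shape, h.shape, kl.shape)  # (4, 16) (4, 128) (4,)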

2016
Youngjune Gwon, William M. Campbell, Douglas E. Sturim, H. T. Kung

Spoken language recognition requires a series of signal processing steps and learning algorithms to model distinguishing characteristics of different languages. In this paper, we present a sparse discriminative feature learning framework for language recognition. We use sparse coding, an unsupervised method, to compute efficient representations for spectral features from a speech utterance whil...
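
A minimal sketch of the unsupervised sparse-coding step, here using scikit-learn's MiniBatchDictionaryLearning on stand-in spectral frames. The dictionary size, sparsity penalty, and random features are assumptions for illustration; the resulting sparse codes would then feed a discriminative language classifier.

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    # Stand-in for spectral feature vectors (e.g., MFCC frames) from utterances;
    # real features would come from a speech front end.
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((500, 39))

    # Learn an overcomplete dictionary without labels, then use the sparse codes
    # as the representation passed to a downstream language classifier.
    dico = MiniBatchDictionaryLearning(n_components=128, alpha=1.0,
                                       transform_algorithm="lasso_lars",
                                       random_state=0)
    codes = dico.fit(frames).transform(frames)
    print(codes.shape)  # (500, 128), with most entries zero per row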

Journal: CoRR, 2017
Karttikeya Mangalam, Tanaya Guha

We investigate the effect and usefulness of spontaneity in speech (i.e., whether a given speech sample is spontaneous or not) in the context of emotion recognition. We hypothesize that the emotional content of speech is interrelated with its spontaneity, and thus propose to use spontaneity classification as an auxiliary task for the problem of emotion recognition. We propose two supervised learning set...
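
The auxiliary-task idea can be sketched as a shared encoder with an emotion head and a spontaneity head trained jointly. The PyTorch sketch below uses made-up layer sizes, label counts, and loss weighting, not the paper's two proposed settings.

    import torch
    import torch.nn as nn

    class MultiTaskSER(nn.Module):
        """Shared encoder with an emotion head (main task) and a
        spontaneity head (auxiliary task); sizes are illustrative."""
        def __init__(self, feat_dim=40, hidden=64, n_emotions=4):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
            self.emotion_head = nn.Linear(hidden, n_emotions)
            self.spontaneity_head = nn.Linear(hidden, 2)  # spontaneous vs. scripted

        def forward(self, x):
            h = self.encoder(x)
            return self.emotion_head(h), self.spontaneity_head(h)

    model = MultiTaskSER()
    x = torch.randn(8, 40)  # 8 utterance-level feature vectors
    emo_logits, spon_logits = model(x)
    loss = nn.functional.cross_entropy(emo_logits, torch.randint(0, 4, (8,))) \
         + 0.3 * nn.functional.cross_entropy(spon_logits, torch.randint(0, 2, (8,)))
    loss.backward()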

2004
Gerardo Ayala

Second language learning is a personalized, collaborative, and lifelong activity. In this paper we present a model of a second language learning environment based on software agents, integrating speech technologies and natural language processing from this perspective. We have developed this model based on previous research concerning technologies for the development of software agents suppor...

Journal: Journal of Medical Signals and Sensors
Yasser Shekofteh, Shahriar Gharibzadeh, Farshad Almasganj

Speech is an easily accessible signal that clearly represents the characteristics of the larynx and vocal folds. Therefore, applying suitable machine learning algorithms to a small part of a recorded speech signal may help in the non-invasive diagnosis of vocal fold diseases. Since there is experimental evidence suggesting the existence of chaotic behavior in speech production s...
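
A minimal sketch, under the assumption of simple MFCC summary statistics and an SVM classifier, of how a short recorded segment could be mapped to a healthy/pathological decision. The paper itself points toward chaotic (nonlinear) descriptors, which are omitted here, and the data below are synthetic.

    import numpy as np
    import librosa
    from sklearn.svm import SVC

    def utterance_features(signal, sr=16000):
        """Summarize a short speech segment with the mean and std of its MFCCs."""
        mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Synthetic stand-ins for recorded segments (label 0 = healthy, 1 = pathological)
    rng = np.random.default_rng(1)
    X = np.stack([utterance_features(rng.standard_normal(16000)) for _ in range(20)])
    y = rng.integers(0, 2, size=20)

    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict(X[:3]))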

There is little understanding of the processes and means involved in teacher learning. Further research is needed to better understand what happens when teachers learn. Therefore, using sociocultural theory (SCT) as a theoretical framework, English for Academic Purposes (EAP) teacher learning was documented and examined. To achieve this goal, an in-depth description of nine in-service EAP teach...

2014
Gabriel Synnaeve, Maarten Versteegh, Emmanuel Dupoux

This paper explores the possibility of learning a semantically relevant lexicon from images and speech only. For this, we train a multi-modal neural network working both on image fragments and on speech features, by learning an embedding in which images and content words that co-occur are close. Making no assumption on the acoustic model, this paper shows promising results on how multi-mo...
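
The co-occurrence embedding can be sketched as a two-tower model trained with a max-margin ranking loss so that matched image/speech pairs score above mismatched ones. Dimensions, margin, and the pooled input features below are assumptions, not the paper's network.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class JointEmbedder(nn.Module):
        """Toy two-tower embedder projecting image fragments and speech
        features into a shared space; co-occurring pairs should be close."""
        def __init__(self, img_dim=512, speech_dim=120, emb_dim=64):
            super().__init__()
            self.img_proj = nn.Linear(img_dim, emb_dim)
            self.speech_proj = nn.Linear(speech_dim, emb_dim)

        def forward(self, img, speech):
            return (F.normalize(self.img_proj(img), dim=-1),
                    F.normalize(self.speech_proj(speech), dim=-1))

    def ranking_loss(img_emb, sp_emb, margin=0.2):
        """Hinge loss: matched (diagonal) pairs should beat mismatched pairs."""
        sims = img_emb @ sp_emb.t()            # cosine similarities (unit-norm inputs)
        pos = sims.diag().unsqueeze(1)
        loss = (margin + sims - pos).clamp(min=0)
        mask = 1.0 - torch.eye(sims.size(0))   # ignore the matched pairs themselves
        return (loss * mask).mean()

    model = JointEmbedder()
    img = torch.randn(16, 512)     # pooled image-fragment features
    speech = torch.randn(16, 120)  # pooled speech features for the same items
    i_emb, s_emb = model(img, speech)
    print(ranking_loss(i_emb, s_emb).item())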

2016
Jun Deng

Speech Emotion Recognition (SER) has made substantial progress in the past few decades since the dawn of emotion and speech research. Various research efforts have been made in an attempt to achieve human-like emotion recognition performance in real-life settings. However, with the availability of speech data obtained from different devices and varied acquisition condi...

Journal: CoRR, 2017
Sibo Tong, Philip N. Garner, Hervé Bourlard

Phoneme-based multilingual training and different crosslingual adaptation techniques for Automatic Speech Recognition (ASR) are explored in Connectionist Temporal Classification (CTC)-based systems. The multilingual model is trained to model a universal IPA-based phone set using the CTC loss function. While the same IPA symbol may not correspond to acoustic similarity, Learning Hidden Unit Contribu...
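
A minimal sketch of CTC training against a shared phone inventory, using torch.nn.CTCLoss over the output of a small recurrent encoder. The inventory size, encoder, and feature dimensions are made up for illustration and do not reproduce the paper's multilingual setup.

    import torch
    import torch.nn as nn

    # Sketch of CTC training over a shared IPA phone inventory (sizes are made up).
    n_ipa_phones = 100                       # universal phone set; blank is index 0
    encoder = nn.LSTM(input_size=80, hidden_size=128, batch_first=True)
    phone_layer = nn.Linear(128, n_ipa_phones + 1)
    ctc = nn.CTCLoss(blank=0)

    feats = torch.randn(4, 200, 80)          # 4 utterances x 200 frames x 80 filterbanks
    targets = torch.randint(1, n_ipa_phones + 1, (4, 30))  # IPA phone label sequences
    input_lengths = torch.full((4,), 200, dtype=torch.long)
    target_lengths = torch.full((4,), 30, dtype=torch.long)

    out, _ = encoder(feats)
    log_probs = phone_layer(out).log_softmax(-1).transpose(0, 1)  # (T, N, C) for CTCLoss
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
    loss.backward()
    print(loss.item())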

Chart: number of search results per year