speech learning model

Embedding-Based Speaker Adaptive Training of Deep Neural Networks

2017

Xiaodong Cui Vaibhava Goel George Saon

An embedding-based speaker adaptive training (SAT) approach is proposed and investigated in this paper for deep neural network acoustic modeling. In this approach, speaker embedding vectors, which are a constant given a particular speaker, are mapped through a control network to layer-dependent elementwise affine transformations to canonicalize the internal feature representations at the output...


Rehabilitation Approaches for Drug Abuse, Addiction and Pediatric Issues

Journal: Iranian Rehabilitation Journal 2015

Asghar Dadkhah

The current issue of the Iranian Rehabilitation Journal contains original research evaluating the efficacy of addiction rehabilitation, an evaluation of a child rehabilitation system for community-based research, a reading program for children with Down syndrome, auditory stream segregation in auditory processing disorder, speech and language disorders, quality of life of adolescents with hearing ...


Adaptive training using discriminative mapping transforms

2008

Chandra Kant Raut Kai Yu Mark J. F. Gales

Speaker adaptive training (SAT) is a useful technique for building speech recognition systems on non-homogeneous data. When combining SAT with discriminative training criteria, maximum likelihood (ML) transforms are often used for unsupervised adaptation tasks. This is because discriminatively estimated transforms are highly sensitive to errors in the supervision hypothesis. In this paper, spea...


Factorized Linear Input Network for Acoustic Model Adaptation in Noisy Conditions

2016

Dung T. Tran Marc Delcroix Atsunori Ogawa Tomohiro Nakatani

Deep neural network (DNN) based acoustic models have obtained remarkable performance for many speech recognition tasks. However, recognition performance still remains too low in noisy conditions. To address this issue, a speech enhancement front-end is often used before recognition. Such a front-end can reduce noise but there may remain a mismatch due to the difference in training and testing co...


A Comparative Study of RPCL and MCE Based Discriminative Training Methods for LVCSR

Journal: Neurocomputing 2011

Zaihu Pang Xihong Wu Lei Xu

This paper presents a comparative study of two discriminative methods, i.e., Rival Penalized Competitive Learning (RPCL) and Minimum Classification Error (MCE), for the tasks of Large Vocabulary Continuous Speech Recognition (LVCSR). MCE aims at minimizing a smoothed sentence error on training data, while RPCL focuses on avoiding misclassification through enforcing the learning of correct class...


Experiments on hiwire database using denoising and adaptation with a hybrid HMM-ANN model

2007

Roberto Gemello Franco Mana Stefano Scanzio

This paper presents the results of a large number of experiments performed on the Hiwire cockpit database with a hybrid HMM-ANN speech recognition model. The Hiwire database is a noisy and non-native English speech corpus for cockpit communication. The noisy component of the database has been used to test two noise reduction methods recently introduced, while the adaptation component is exploit...


Hidden Markov Model Based Animal Acoustic Censusing: Learning from Speech Processing Technology

2008

Michael T. Johnson

Individually distinct acoustic features have been observed in a wide range of vocally active animal species and have been used to study animals for decades. Only a few studies, however, have attempted to examine the use of acoustic identification of individuals to assess population, either for evaluating the population structure, population abundance and density, or for assessing animal seasona...


Optimizing DNN Adaptation for Recognition of Enhanced Speech

2017

Marco Matassoni Alessio Brutti Daniele Falavigna

Speech enhancement directly using a deep neural network (DNN) is of major interest due to the capability of DNNs to tangibly reduce the impact of noisy conditions in speech recognition tasks. Similarly, DNN based acoustic model adaptation to new environmental conditions is another challenging topic. In this paper we present an analysis of acoustic model adaptation in the presence of a disjoint speech ...


Pragmatic Representations in Iranian High School English Textbooks

Journal: زبانشناسی کاربردی 2015

Elaheh Zaferanieh Seyed Mohammad Hosseini-Maasoum

Owing to the growing interest in communicative, cultural and pragmatic aspects of second language learning in recent years, the present study tried to investigate representations of pragmatic aspects of English as a foreign language in Iranian high school textbooks. Using Halliday’s (1978) and Searle’s (1976) models, different language functions and speech acts were specifically determined and...


Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models

2009

Matthew Gibson

Hidden Markov model (HMM)-based speech synthesis systems possess several advantages over concatenative synthesis systems. One such advantage is the relative ease with which HMM-based systems are adapted to speakers not present in the training dataset. Speaker adaptation methods used in the field of HMM-based automatic speech recognition (ASR) are adopted for this task. In the case of unsupervi...
