Combining evidence from a generative and a discriminative model in phoneme recognition
Authors
Abstract
We investigate combining two sources of evidence for phoneme recognition in a multi-stream framework: the log-likelihood of the features under a generative Gaussian mixture model, and the phoneme posterior probabilities from a discriminative multilayer perceptron. Two multi-stream combination techniques, early integration and late integration, are used to combine the evidence from these models. With multi-stream combination, we obtain a phoneme recognition accuracy of 74% on the standard TIMIT database, an absolute improvement of 2.5% over the best single stream.
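As a rough illustration of the two combination strategies named in the abstract, the sketch below contrasts early integration (joining the per-frame stream features before a single classifier) with late integration (merging the outputs of per-stream classifiers). The abstract does not say how the streams are merged, so the concatenation step, the weighted log-linear rule, the `weight` parameter, the function names, and the toy numbers are all assumptions made for illustration, not the authors' method.

```python
import math

def early_integration(stream_a, stream_b):
    """Early integration (assumed form): concatenate the per-frame feature
    vectors of two streams so one downstream classifier sees both."""
    return [list(a) + list(b) for a, b in zip(stream_a, stream_b)]

def late_integration(post_a, post_b, weight=0.5, eps=1e-12):
    """Late integration (assumed form): weighted log-linear combination of
    two per-frame phoneme posterior vectors, renormalized to sum to 1."""
    combined = []
    for pa, pb in zip(post_a, post_b):
        log_scores = [
            weight * math.log(x + eps) + (1.0 - weight) * math.log(y + eps)
            for x, y in zip(pa, pb)
        ]
        top = max(log_scores)  # subtract the max for numerical stability
        unnorm = [math.exp(s - top) for s in log_scores]
        total = sum(unnorm)
        combined.append([u / total for u in unnorm])
    return combined

# Hypothetical toy data: 2 frames, 2 GMM log-likelihood features, 3 phonemes.
gmm_loglik = [[-12.1, -9.8], [-7.5, -11.0]]
mlp_post = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
second_post = [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]

early = early_integration(gmm_loglik, mlp_post)
late = late_integration(mlp_post, second_post, weight=0.5)
decoded = [row.index(max(row)) for row in late]  # per-frame best phoneme index
```

In this sketch `early` would be fed to a single classifier trained on the joint vector, while `late` already holds combined posteriors that a decoder could consume directly; `weight` controls how much each stream is trusted.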
Similar resources
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Automatic Social Role Recognition in Professional Meetings
This paper investigates the influence of social roles on the conversation style and linguistic usage of participants in professional meeting recordings. First, we implement a generative model to capture the sequential nature of conversations in terms of participants' turn-taking behavior. In parallel, the system also employs a probabilistic discriminative classifier on a set of high level fea...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers because of its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Error Pattern Detection Integrating Generative and Discriminative Learning for Computer-Aided Pronunciation Training
Computer-Assisted Language Learning aims to have computers serve as virtual language tutors that help people learn non-native languages in today's globalized world. In this paper we propose a framework to incorporate specially designed discriminative models with carefully trained generative models for the task of pronunciation error pattern detection. For each phoneme we train one or ...
Combining Evidence from Unconstrained Spoken Term Frequency Estimation for Improved Speech Retrieval
Title of dissertation: Combining Evidence from Unconstrained Spoken Term Frequency Estimation for Improved Speech Retrieval. J. Scott Olsson, Doctor of Philosophy, 2008. Dissertation directed by: Associate Professor Douglas W. Oard, College of Information Studies. This dissertation considers the problem of information retrieval in speech. Today’s speech retrieval systems generally use a large vocab...