Discriminative resolution enhancement in acoustic modelling

Authors

Jacques Duchateau

Kris Demuynck

Patrick Wambacq

Abstract

The accuracy of the acoustic models in large vocabulary recognition systems can be improved by increasing the resolution in the acoustic feature space. This can be achieved by increasing the number of Gaussian densities in the models through splitting of the Gaussians. This paper proposes a novel algorithm for this splitting operation. It is based on the phonetic decision tree used for state tying in context-dependent modelling. The advantage of the method is that it improves the capability of the acoustic models to discriminate between the different tied states. The proposed splitting algorithm was evaluated on the Wall Street Journal recognition task. Comparison with a commonly used splitting algorithm shows that our method can provide smaller (and thus faster) acoustic models and results in lower error rates.
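For orientation, the kind of conventional Gaussian splitting that the abstract uses as a baseline can be sketched as follows. This is not the decision-tree-based algorithm proposed in the paper, whose details are not given here. It is a minimal illustration of one widely used heuristic: take the mixture component with the largest weight, duplicate it, halve the weight, and shift the two means in opposite directions along the standard deviation. The function name `split_heaviest_gaussian`, the diagonal-covariance layout, and the perturbation factor of 0.2 are assumptions made for this example.

```python
import numpy as np


def split_heaviest_gaussian(weights, means, variances, perturb=0.2):
    """Split the heaviest component of a diagonal-covariance Gaussian mixture.

    Hypothetical helper for illustration only (not the paper's method).
    weights:   shape (K,)    mixture weights summing to 1
    means:     shape (K, D)  component means
    variances: shape (K, D)  diagonal variances
    perturb:   assumed fraction of a standard deviation used to shift the means
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Pick the component with the largest weight.
    k = int(np.argmax(weights))
    offset = perturb * np.sqrt(variances[k])

    # Both halves receive half of the original weight.
    new_weights = np.append(weights, weights[k] / 2.0)
    new_weights[k] = weights[k] / 2.0

    # Shift the two copies in opposite directions along the std deviation.
    new_means = np.vstack([means, means[k] + offset])
    new_means[k] = means[k] - offset

    # The new component inherits the variances of its parent.
    new_variances = np.vstack([variances, variances[k]])
    return new_weights, new_means, new_variances


if __name__ == "__main__":
    w, m, v = split_heaviest_gaussian(
        [0.7, 0.3], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 4.0], [1.0, 1.0]]
    )
    print(w)
    print(m)
```

A split of this kind adds resolution wherever the most data is, regardless of whether that region is confusable with other tied states. The abstract's claim is that guiding the split with the phonetic decision tree improves discrimination between tied states instead.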


Similar resources

Discriminative models for speech recognition

The discriminative approach to speech recognition offers several advantages over the generative, such as a simple introduction of additional dependencies and direct modelling of sentence posterior probabilities/decision boundaries. However, the number of sentences that can possibly be encoded into an observation sequence can be vast, which makes the application of models, such as support vector...


Investigation of Acoustic Modelling Techniques for LVCSR Systems

The CU-HTK evaluation systems typically make use of a multiple-pass, multiple-branch framework. This allows a range of acoustic models to be used in the framework and the output from all the systems, or branches, to be combined to give the final output. This paper describes experiments with several advanced acoustic modelling techniques that were candidate approaches for the 2004 CU-HTK large voc...


Integration of Face and Voice Recognition

cepstral features and features based on a bio-mechanical model of the visible articulators will be the identity-carrying characteristics extracted from acoustic speech and visual speech respectively. Speakers will be modelled by multi-layer perceptrons trained as discriminative models or, alternatively, as predictive models. In the discriminative modelling scheme, each speaker model will be tra...


Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition

A method is proposed to incorporate mixture density splitting into the discriminative training of acoustic models for speech recognition. The standard method is to obtain a high-resolution acoustic model by maximum likelihood training and density splitting, and then to improve this model by discriminative training. We choose a log-linear form of acoustic model because for a single Gaussian density p...


Discriminative semi-parametric trajectory model for speech recognition

Hidden Markov Models (HMMs) are the most commonly used acoustic models for speech recognition. In HMMs, successive observations are assumed to be independent given the state sequence. This is known as the conditional independence assumption. Consequently, the temporal (inter-frame) correlations are poorly modelled. This limitation may be reduced by incorporating some form of trajecto...



Publication date: 2000
