Discriminative resolution enhancement in acoustic modelling
Abstract
The accuracy of the acoustic models in large-vocabulary recognition systems can be improved by increasing the resolution in the acoustic feature space. This can be achieved by increasing the number of Gaussian densities in the models through splitting of the Gaussians. This paper proposes a novel algorithm for this splitting operation. It is based on the phonetic decision tree used for state tying in context-dependent modelling. An advantage of the method is that it improves the capability of the acoustic models to discriminate between the different tied states. The proposed splitting algorithm was evaluated on the Wall Street Journal recognition task. Comparison with a commonly used splitting algorithm clearly shows that our method can provide smaller (and thus faster) acoustic models and results in lower error rates.
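To make the idea of "increasing the number of Gaussian densities by splitting" concrete, the sketch below shows a generic mean-perturbation split of one component in a diagonal-covariance Gaussian mixture. It is not the decision-tree-based algorithm proposed in the paper, and it is not necessarily the baseline the paper compares against. The function name `split_heaviest_gaussian`, the choice to split the heaviest component, and the perturbation factor of 0.2 standard deviations are all assumptions made for illustration.

```python
import math


def split_heaviest_gaussian(weights, means, variances, perturb=0.2):
    """Split the mixture component with the largest weight into two.

    weights:   list of floats, one per component (summing to 1).
    means:     list of per-dimension mean lists, one per component.
    variances: list of per-dimension variance lists (diagonal covariance).
    perturb:   hypothetical offset, in standard deviations, applied to the means.
    """
    # Pick the component that currently carries the most probability mass.
    k = max(range(len(weights)), key=lambda i: weights[i])

    # Shift the mean by +/- perturb standard deviations in every dimension.
    offset = [perturb * math.sqrt(v) for v in variances[k]]
    mean_up = [m + d for m, d in zip(means[k], offset)]
    mean_down = [m - d for m, d in zip(means[k], offset)]

    # Each child inherits half of the parent weight and a copy of its variances.
    half = weights[k] / 2.0
    new_weights = weights[:k] + [half, half] + weights[k + 1:]
    new_means = means[:k] + [mean_up, mean_down] + means[k + 1:]
    new_vars = (variances[:k]
                + [list(variances[k]), list(variances[k])]
                + variances[k + 1:])
    return new_weights, new_means, new_vars


# Toy two-component mixture in two dimensions (made-up numbers).
weights, means, variances = split_heaviest_gaussian(
    [0.7, 0.3],
    [[0.0, 1.0], [2.0, 2.0]],
    [[1.0, 4.0], [1.0, 1.0]],
)
```

After such a split the mixture has one more component and the same total weight; in practice the enlarged model would then be re-estimated on training data. A split of this kind looks only at the state being split, whereas the abstract describes choosing splits so that the tied states become easier to tell apart.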
Similar resources
Discriminative models for speech recognition
The discriminative approach to speech recognition offers several advantages over the generative, such as a simple introduction of additional dependencies and direct modelling of sentence posterior probabilities/decision boundaries. However, the number of sentences that can possibly be encoded into an observation sequence can be vast, which makes the application of models, such as support vector...
Investigation of Acoustic Modelling Techniques for LVCSR Systems
The CU-HTK evaluation systems typically make use of a multiple-pass, multiple-branch framework. This allows a range of acoustic models to be used in the framework and the output from all the systems, or branches, to be combined to give the final output. This paper describes experiments with several advanced acoustic modelling techniques that were candidate approaches for the 2004 CU-HTK large voc...
Integration of Face and Voice Recognition
Cepstral features and features based on a bio-mechanical model of the visible articulators will be the identity-carrying characteristics extracted from acoustic speech and visual speech respectively. Speakers will be modelled by multi-layer perceptrons trained as discriminative models or, alternatively, as predictive models. In the discriminative modelling scheme, each speaker model will be tra...
Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition
A method is proposed to incorporate mixture density splitting into the acoustic model discriminative training for speech recognition. The standard method is to obtain a high-resolution acoustic model by maximum likelihood training and density splitting, and then to improve this model by discriminative training. We choose a log-linear form of acoustic model because for a single Gaussian density p...
Discriminative semi-parametric trajectory model for speech recognition
Hidden Markov Models (HMMs) are the most commonly used acoustic model for speech recognition. In HMMs, the probability of successive observations is assumed independent given the state sequence. This is known as the conditional independence assumption. Consequently, the temporal (inter-frame) correlations are poorly modelled. This limitation may be reduced by incorporating some form of trajecto...