Semi-continuous segmental probability modeling for continuous speech recognition
Authors
Abstract
This paper presents the design of semi-continuous segmental probability models (SCSPMs) for large-vocabulary continuous speech recognition. The tied Gaussian densities are trained using data from all states of all utterances, while the mixture weights of each state are estimated using only the data of that state. The SCSPMs tie the densities of all states from all Speech Recognition Units (SRUs) into a shared pdf codebook, which greatly reduces the number of Gaussian densities. Several pruning methods are reviewed, and a new pruning criterion is then proposed to reduce the number of tied mixture Gaussian densities, exploiting the fact that only a small subset of them has large tying weights. Our preliminary experiments show that the SCSPM combined with the pruning techniques reduces model storage and speeds up the system with little degradation in accuracy compared to the prior continuous model.
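The tied-mixture idea in the abstract can be illustrated with a minimal sketch: one Gaussian codebook is shared by every state, each state keeps only its own weight vector, and small weights are dropped. The threshold rule below is a generic stand-in and is not the pruning criterion proposed in the paper; the diagonal covariances, the function names, and all numeric values are assumptions made for illustration.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def prune_weights(weights, threshold):
    """Keep weights >= threshold and renormalize them to sum to 1.

    Hypothetical criterion for illustration only; if nothing survives,
    the single largest weight is kept so the state stays usable.
    """
    kept = {k: w for k, w in enumerate(weights) if w >= threshold}
    if not kept:
        best = max(range(len(weights)), key=lambda k: weights[k])
        kept = {best: weights[best]}
    total = sum(kept.values())
    return {k: w / total for k, w in kept.items()}

def state_likelihood(x, codebook, sparse_weights):
    """b_j(x) = sum_k c_jk * N(x; mu_k, var_k) over the retained entries."""
    return sum(
        w * math.exp(diag_gaussian_logpdf(x, *codebook[k]))
        for k, w in sparse_weights.items()
    )

# Shared codebook of (mean, variance) pairs used by every state (toy values).
codebook = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([3.0, 3.0], [1.0, 1.0]),
    ([-3.0, 0.0], [2.0, 2.0]),
]

# Weights of one state: most of the mass sits on a few codebook entries.
state_weights = [0.70, 0.28, 0.02]
pruned = prune_weights(state_weights, threshold=0.05)

x = [0.5, -0.2]
full = state_likelihood(x, codebook, dict(enumerate(state_weights)))
approx = state_likelihood(x, codebook, pruned)
```

In this sketch a state stores only the indices and weights of its retained codebook entries, which is where the storage and speed savings described in the abstract would come from; how much accuracy is lost depends on the actual criterion and data.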
Similar resources
Semi-continuous segmental probability model for speech signals
A semi-continuous segmental probability model, which can be considered a special form of the continuous mixture segmental probability model whose continuous output probability density functions share a mixture Gaussian density codebook, is proposed in this paper. The amount of training data required, as well as the computational complexity of the semi-continuous segmental probability model (S...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
A segmental approach to text-independent speaker verification
Current text-independent speaker verification systems are usually based on modeling globally the probability density function (PDF) of the speaker feature vectors. In this paper, segmental approaches to text-independent speaker verification are discussed. Unlike the schemes based on Large Vocabulary Continuous Speech Recognition (LVCSR) with previously trained phone models, our systems are based ...
Multiple codebook semi-continuous hidden Markov models for speaker-independent continuous speech recognition
A semi-continuous hidden Markov model based on the multiple vector quantization codebooks is used here for large-vocabulary speaker-independent continuous speech recognition. In the techniques employed here, the semi-continuous output probability density function for each codebook is represented by a combination of the corresponding discrete output probabilities of the hidden Markov model and t...
Segment-Based Acoustic Models for Continuous Speech Recognition
Tied-mixture (or semi-continuous) distributions are an important tool for acoustic modeling, used in many high-performance speech recognition systems today. This paper provides a... ...ity or acoustic observations conditioned on the state in hidden-Markov models (HMM), or for the case of the SSM, conditioned on a region of the model. Some of the options that have been investigated include discrete dis...