Product of Gaussians as a distributed representation for speech recognition
Authors
Abstract
Distributed representations allow the effective number of Gaussian components in a mixture model, or state of an HMM, to be increased without dramatically increasing the number of model parameters. Various forms of distributed representation have previously been investigated. In this work it is shown that the product of experts (PoE) framework may be viewed as a distributed representation when the individual experts are mixtures of Gaussians. However, in contrast to the standard PoE model, the individual experts are not required to be valid distributions, thus allowing additional flexibility in the component priors and variances. The performance of PoE models when used as a distributed representation is evaluated on a large vocabulary speech recognition task, SwitchBoard.
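To make the distributed-representation claim concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes each expert is an ordinary, normalised mixture of Gaussians given as `(weight, mean, variance)` triples, and all names (`gauss`, `mixture_pdf`, `product_of_mixtures`, `expert_a`, `expert_b`) are hypothetical. Multiplying K experts with M components each expands into a single unnormalised mixture with M^K effective components, while only K×M component parameter sets are stored.

```python
import itertools
import math


def gauss(x, mean, var):
    """Density of a 1-D Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, mixture):
    """Evaluate a mixture given as a list of (weight, mean, var) triples."""
    return sum(w * gauss(x, m, v) for w, m, v in mixture)


def product_of_mixtures(experts):
    """Expand a product of Gaussian-mixture experts into one unnormalised mixture.

    Each combination of one component per expert yields one effective
    component: precisions add, and the mean is the precision-weighted mean.
    """
    expanded = []
    for combo in itertools.product(*experts):
        precision = sum(1.0 / v for _, _, v in combo)
        var = 1.0 / precision
        mean = var * sum(m / v for _, m, v in combo)
        # A product of Gaussians is scale * N(x; mean, var); recover the
        # scale by evaluating both sides at x = mean.
        scale = math.prod(w * gauss(mean, m, v) for w, m, v in combo)
        scale /= gauss(mean, mean, var)
        expanded.append((scale, mean, var))
    return expanded


# Two hypothetical experts with two components each (illustrative values only).
expert_a = [(0.6, -1.0, 1.0), (0.4, 2.0, 0.5)]
expert_b = [(0.3, 0.0, 2.0), (0.7, 1.5, 1.0)]

effective = product_of_mixtures([expert_a, expert_b])
print(len(expert_a) + len(expert_b), "stored components")
print(len(effective), "effective components")

# The expansion agrees pointwise with the direct product of the experts.
x = 0.3
direct = mixture_pdf(x, expert_a) * mixture_pdf(x, expert_b)
via_expansion = sum(s * gauss(x, m, v) for s, m, v in effective)
print(math.isclose(direct, via_expansion, rel_tol=1e-9))
```

The expanded scales do not sum to one, so the product must be renormalised before it is used as a density. The sketch keeps every expert a valid distribution; the relaxation described in the abstract, where experts need not be valid distributions, is not modelled here.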
Similar resources
Product of Gaussians for speech recognition
Recently there has been interest in the use of classifiers based on the product of experts (PoE) framework. PoEs offer an alternative to the standard mixture of experts (MoE) framework. It may be viewed as examining the intersection of a series of experts, rather than the union as in the MoE framework. This paper presents a particular implementation of PoEs, the normalised product of Gaussians ...
A new method for robust speech recognition based on missing data using a bidirectional neural network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Reduced Gaussian mixture models in a large vocabulary continuous speech recognizer
Large vocabulary continuous speech recognition (LVCSR) systems usually employ several tens of thousands of Gaussian mixture components for an accurate statistical representation of naturally spoken human speech. For applications that cannot afford the computationally expensive evaluation of numerous Gaussians during recognition time, it is an important question whether the number of Gaussians can ...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...