Continuous Digits Recognition Leveraging Invariant Structure
Authors
Abstract
Recently, an invariant structure of speech was proposed, in which the inevitable acoustic variations caused by non-linguistic factors are effectively removed from speech. The invariant structure was applied to isolated word recognition, and the experimental results showed good performance. However, the previous method cannot be applied directly to continuous speech recognition because it lacked an efficient decoding algorithm. In this paper, we propose a method to leverage the invariant structure in continuous digit recognition. We use a traditional HMM-based Automatic Speech Recognition (ASR) system to obtain N-best lists with phone alignments. We then construct invariant structures from these phone alignments and re-rank the N-best lists by evaluating which hypothesis is structurally more valid. Experimental results show a relative word error rate (WER) improvement of 17.4% over the baseline HMM-based ASR system.
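The re-ranking idea in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not state: each aligned phone segment is modeled as a diagonal Gaussian, the structure is the vector of pairwise Bhattacharyya distances between segments, each hypothesis carries a hypothetical `reference` structure for its word sequence, and the structural score is combined linearly with the baseline score through a hypothetical `weight`. The names `Hypothesis`, `structure`, and `rerank` are invented for this sketch.

```python
import math
from typing import List, NamedTuple, Tuple

Frame = List[float]
Gaussian = Tuple[List[float], List[float]]  # (mean, variance) per dimension


class Hypothesis(NamedTuple):
    words: str                   # recognized digit string
    asr_score: float             # log score from the baseline recognizer
    segments: List[List[Frame]]  # feature frames grouped by phone alignment
    reference: List[float]       # assumed expected structure for `words`


def fit_diag_gaussian(frames: List[Frame]) -> Gaussian:
    """Estimate a diagonal Gaussian from the frames of one phone segment."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so very short segments do not produce zeros.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def bhattacharyya(g1: Gaussian, g2: Gaussian) -> float:
    """Bhattacharyya distance between two diagonal Gaussians."""
    (m1, v1), (m2, v2) = g1, g2
    dist = 0.0
    for a, b, s, t in zip(m1, m2, v1, v2):
        avg = (s + t) / 2.0
        dist += 0.125 * (a - b) ** 2 / avg
        dist += 0.5 * math.log(avg / math.sqrt(s * t))
    return dist


def structure(segments: List[List[Frame]]) -> List[float]:
    """Pairwise distances between all segments (upper triangle, row order)."""
    g = [fit_diag_gaussian(seg) for seg in segments]
    return [bhattacharyya(g[i], g[j])
            for i in range(len(g)) for j in range(i + 1, len(g))]


def structure_score(hyp: Hypothesis) -> float:
    """Negative Euclidean distance between observed and reference structure."""
    observed = structure(hyp.segments)
    return -math.sqrt(sum((x - y) ** 2
                          for x, y in zip(observed, hyp.reference)))


def rerank(nbest: List[Hypothesis], weight: float = 1.0) -> List[Hypothesis]:
    """Sort an N-best list by baseline score plus weighted structural score."""
    return sorted(nbest,
                  key=lambda h: h.asr_score + weight * structure_score(h),
                  reverse=True)
```

Because the structure is built only from distances between segments, a per-dimension scaling and shift applied to every frame of an utterance leaves it unchanged in this sketch, which is the kind of invariance to non-linguistic variation that the abstract refers to. With `weight=0.0` the ordering falls back to the baseline recognizer's ranking.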
Similar resources
A Note on the Greedy β-transformation with Deleted Digits
In this article we give the attractor and the absolutely continuous, invariant measure of the greedy and lazy β-transformation with deleted digits and show their ergodicity. We will consider two specific examples of greedy β-transformations of which the invariant measure can be explicitly calculated.
Discriminative Reranking for LVCSR Leveraging Invariant Structure
An invariant structure is one of the long-span acoustic representations, where acoustic variations caused by non-linguistic factors are effectively removed from speech. We present in this paper a new method to leverage the invariant structures as features of discriminative reranking for Large Vocabulary Continuous Speech Recognition (LVCSR). First we use a traditional HMM-based LVCSR system to g...
A Natural Extension for the Greedy β-transformation with Three Deleted Digits (Karma Dajani and Charlene Kalle)
We give an explicit expression for the invariant measure, absolutely continuous with respect to the Lebesgue measure, of the greedy β-transformation with three deleted digits. We define a version of the natural extension of the transformation to obtain this expression. We get that the transformation is exact and weakly Bernoulli.
Development of a Real-time ASR System for Slovak SpeechDat Database
This paper describes the development of a real-time Slovak speech recognition system for voice-operated telephone services. The system is based on the SPHINX2 platform. The decoder, which uses Hidden Markov Models, was trained on the SpeechDat-E Slovak database. It is a speaker-independent, large-vocabulary, continuous-speech, real-time automatic speech recognition system. Test results are given for the t...
A Natural Extension for the Greedy β-transformation with Three Arbitrary Digits
We construct a planar version of the natural extension of the piecewise linear transformation T generating greedy β-expansions with digits in an arbitrary set of real numbers A = {a0, a1, a2}. As a result, we derive in an easy way a closed formula for the density of the unique T-invariant measure μ absolutely continuous with respect to Lebesgue measure. Furthermore, we show that T is exact and...