Unsupervised training methods for discriminative language modeling
Abstract
Discriminative language modeling (DLM) aims to choose the most accurate word sequence by reranking the alternatives output by an automatic speech recognizer (ASR). The conventional (supervised) way of training a DLM requires a large amount of acoustic recordings together with their manual reference transcriptions. These transcriptions are used to determine the target ranks of the ASR outputs, but may be hard to obtain. Previous studies make use of the existing transcribed data to build a confusion model which enlarges the training set by generating artificial data, a process known as semi-supervised training. In this study we concentrate on the unsupervised setting, where no manual transcriptions are available at all. We propose three ways to determine a sequence that could serve as the missing reference text, and two approaches which use this information to (i) determine the ranks of the ASR outputs in order to train the discriminative model directly, and (ii) build a confusion model in order to generate artificial training examples. We compare our techniques with the supervised and semi-supervised setups. Using the reranking variant of the WER-sensitive perceptron algorithm, we obtain word error rate improvements of up to half of those achieved in the supervised case.
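The reranking idea in the abstract can be sketched with a minimal structured perceptron over N-best lists. This is an illustrative toy, not the paper's implementation: the unigram features, the plain (non-WER-scaled) update, and the toy data are all assumptions, and in the unsupervised setting the per-hypothesis WER values would themselves be estimates rather than gold references.

```python
# Toy perceptron reranker for DLM (illustrative sketch, not the paper's method).
from collections import Counter

def features(hyp):
    """Unigram count features over a hypothesis string (a hypothetical choice)."""
    return Counter(hyp.split())

def score(weights, hyp):
    """Linear score of a hypothesis under the current weight vector."""
    return sum(weights[f] * v for f, v in features(hyp).items())

def train(nbest_lists, epochs=5):
    """nbest_lists: list of N-best lists, each a list of (hypothesis, wer) pairs.
    In the unsupervised setting, 'wer' would be an estimate derived from an
    automatically selected pseudo-reference rather than a manual transcript."""
    weights = Counter()
    for _ in range(epochs):
        for nbest in nbest_lists:
            # Oracle = hypothesis with the lowest (estimated) WER in the list.
            oracle = min(nbest, key=lambda h: h[1])[0]
            # Current model's top-ranked hypothesis.
            predicted = max(nbest, key=lambda h: score(weights, h[0]))[0]
            if predicted != oracle:
                # Standard perceptron update: promote oracle, demote prediction.
                weights.update(features(oracle))
                weights.subtract(features(predicted))
    return weights
```

A usage example: after training on an N-best list whose first hypothesis is wrong, reranking with the learned weights prefers the lower-WER hypothesis. The WER-sensitive variant mentioned in the abstract would additionally scale each update by the WER difference between the two hypotheses.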
Similar papers
Phrasal Cohort Based Unsupervised Discriminative Language Modeling
Simulated confusions enable the use of large text-only corpora for discriminative language modeling by hallucinating the likely recognition outputs that each (correct) sentence would be confused with. In [1], a novel approach was introduced to simulate confusions using phrasal cohorts derived directly from recognition output. However, the described approach relied on transcribed speech to deriv...
Lightly supervised training for risk-based discriminative language models
We propose a lightly supervised training method for a discriminative language model (DLM) based on risk minimization criteria. In lightly supervised training, pseudo labels generated by automatic speech recognition (ASR) are used as references. However, as these labels usually include recognition errors, the discriminative models estimated from such faulty reference labels may degrade ASR perfo...
Unsupervised discriminative language modeling using error rate estimator
Discriminative language modeling is a successful approach to improving speech recognition accuracy. However, it requires a large amount of spoken data and manually transcribed reference text for training. This paper proposes an unsupervised training method to overcome this handicap. The key idea is to use an error rate estimator, instead of calculating the true error rate from the reference. In...
Introspective Generative Modeling: Decide Discriminatively
We study unsupervised learning by developing introspective generative modeling (IGM) that attains a generator using progressively learned deep convolutional neural networks. The generator is itself a discriminator, capable of introspection: being able to self-evaluate the difference between its generated samples and the given training data. When followed by repeated discriminative learning, des...
Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
Discriminative training for machine translation has been well studied in the recent past. A limitation of the work to date is that it relies on the availability of high-quality in-domain bilingual text for supervised training. We present an unsupervised discriminative training framework to incorporate the usually plentiful target-language monolingual data by using a rough “reverse” translation ...