How to Add Word Classes to the Kaldi Speech Recognition Toolkit
نویسندگان
چکیده
The paper explains and illustrates how the concept of word classes can be added to the widely used open-source speech recognition toolkit Kaldi. The suggested extensions to existing Kaldi recipes are limited to the word-level grammar (G) and the pronunciation lexicon (L) models. The implementation to modify the weighted finite state transducers employed in Kaldi makes use of the OpenFST library. In experiments on small and mid-sized corpora with vocabulary sizes of 1.5K and 5.5K respectively a slight improvement of the word error rate is observed when the approach is tested with (hand-crafted) word classes. Furthermore it is shown that the introduction of sub-word unit models for open word classes can help to robustly detect and classify out-of-vocabulary words without impairing word recognition accuracy.
منابع مشابه
Combining Semantic Word Classes and Sub-Word Unit Speech Recognition for Robust OOV Detection
Out-of-vocabulary words (OOVs) are often the main reason for the failure of tasks like automated voice searches or humanmachine dialogs. This is especially true if rare but task-relevant content words, e.g. person or location names, are not in the recognizer’s vocabulary. Since applications like spoken dialog systems use the result of the speech recognizer to extract a semantic representation o...
متن کاملFree on-line speech recogniser based on Kaldi ASR toolkit producing word posterior lattices
This paper presents an extension of the Kaldi automatic speech recognition toolkit to support on-line recognition. The resulting recogniser supports acoustic models trained using state-of-theart acoustic modelling techniques. As the recogniser produces word posterior lattices, it is particularly useful in statistical dialogue systems, which try to exploit uncertainty in the recogniser’s output....
متن کاملپایهگذاری بستری نو و کارآمد در حوزه بازشناسی گفتار فارسی
Although researches in the field of Persian speech recognition claim a thirty-year-old history in Iran which has achieved considerable progresses, due to the lack of well-defined experimental framework, outcomes from many of these researches are not comparable to each other and their accurate assessment won’t be possible. The experimental framework includes ASR toolkit and speech database ...
متن کاملOpen Source German Distant Speech Recognition: Corpus and Acoustic Model
We present a new freely available corpus for German distant speech recognition and report speaker-independent word error rate (WER) results for two open source speech recognizers trained on this corpus. The corpus has been recorded in a controlled environment with three different microphones at a distance of one meter. It comprises 180 different speakers with a total of 36 hours of audio record...
متن کاملTper Hcaeser Pidi Implementation of the Standard I-vector System for the Kaldi Speech Recognition Toolkit
This report describes implementation of the standard i-vector-PLDA framework for the Kaldi speech recognition toolkit. The current existing speaker recognition system implementation is based on the Subspace Gaussian Mixture Model (SGMM) technique although it shares many similarities with the standard implementation. In our implementation, we modified the code so that it mimics the standard algo...
متن کامل