How to Add Word Classes to the Kaldi Speech Recognition Toolkit

نویسندگان

  • Axel Horndasch
  • Caroline Kaufhold
  • Elmar Nöth
چکیده

The paper explains and illustrates how the concept of word classes can be added to the widely used open-source speech recognition toolkit Kaldi. The suggested extensions to existing Kaldi recipes are limited to the word-level grammar (G) and the pronunciation lexicon (L) models. The implementation to modify the weighted finite state transducers employed in Kaldi makes use of the OpenFST library. In experiments on small and mid-sized corpora with vocabulary sizes of 1.5K and 5.5K respectively a slight improvement of the word error rate is observed when the approach is tested with (hand-crafted) word classes. Furthermore it is shown that the introduction of sub-word unit models for open word classes can help to robustly detect and classify out-of-vocabulary words without impairing word recognition accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Combining Semantic Word Classes and Sub-Word Unit Speech Recognition for Robust OOV Detection

Out-of-vocabulary words (OOVs) are often the main reason for the failure of tasks like automated voice searches or humanmachine dialogs. This is especially true if rare but task-relevant content words, e.g. person or location names, are not in the recognizer’s vocabulary. Since applications like spoken dialog systems use the result of the speech recognizer to extract a semantic representation o...

متن کامل

Free on-line speech recogniser based on Kaldi ASR toolkit producing word posterior lattices

This paper presents an extension of the Kaldi automatic speech recognition toolkit to support on-line recognition. The resulting recogniser supports acoustic models trained using state-of-theart acoustic modelling techniques. As the recogniser produces word posterior lattices, it is particularly useful in statistical dialogue systems, which try to exploit uncertainty in the recogniser’s output....

متن کامل

پایه‌گذاری بستری نو و کارآمد در حوزه بازشناسی گفتار فارسی

Although researches in the field of Persian speech recognition  claim  a  thirty-year-old  history in Iran  which has achieved considerable progresses, due to the lack of well-defined experimental framework, outcomes from many of these researches are not comparable to each other and their accurate assessment won’t be possible. The experimental framework includes ASR toolkit and speech database ...

متن کامل

Open Source German Distant Speech Recognition: Corpus and Acoustic Model

We present a new freely available corpus for German distant speech recognition and report speaker-independent word error rate (WER) results for two open source speech recognizers trained on this corpus. The corpus has been recorded in a controlled environment with three different microphones at a distance of one meter. It comprises 180 different speakers with a total of 36 hours of audio record...

متن کامل

Tper Hcaeser Pidi Implementation of the Standard I-vector System for the Kaldi Speech Recognition Toolkit

This report describes implementation of the standard i-vector-PLDA framework for the Kaldi speech recognition toolkit. The current existing speaker recognition system implementation is based on the Subspace Gaussian Mixture Model (SGMM) technique although it shares many similarities with the standard implementation. In our implementation, we modified the code so that it mimics the standard algo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016