EWAVES: an efficient decoding algorithm for lexical tree based speech recognition

Authors

  • Patrick Nguyen
  • Luca Rigazio
  • Jean-Claude Junqua
Abstract

We present an optimized implementation of the Viterbi algorithm suitable for small to large vocabularies and for isolated or continuous speech recognition. The Viterbi algorithm is certainly the most popular dynamic programming algorithm used in speech recognition. In this paper we propose a new algorithm that outperforms the Viterbi algorithm in terms of complexity and memory requirements. It is based on the assumption of strictly left-to-right models and explores the lexical tree in an optimal way, such that book-keeping computation is minimized. The tree is encoded such that the children of a node are placed contiguously and in increasing order in the memory heap, so that the proposed algorithm also optimizes cache usage. Even though the algorithm is asymptotically two times faster than the conventional Viterbi algorithm, in our experiments we measured an improvement of at least a factor of three.
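The contiguous tree encoding mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' EWAVES implementation: the function name `build_flat_lexical_tree`, the lexicon format, and the breadth-first layout are all assumptions chosen for illustration. The sketch only shows one way to place the children of every node in a contiguous, increasing index range, which is the property the abstract credits for good cache behaviour.

```python
from collections import deque


def build_flat_lexical_tree(lexicon):
    """Flatten a pronunciation trie into parallel arrays (illustrative only).

    lexicon: dict mapping word -> list of phone symbols (hypothetical format).
    Returns (labels, first_child, num_children, word_at), where the children
    of node i occupy indices first_child[i] .. first_child[i] + num_children[i] - 1.
    """
    # Step 1: build an ordinary pointer-based trie over phone sequences.
    root = {"children": {}, "word": None}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node["children"].setdefault(
                phone, {"children": {}, "word": None}
            )
        node["word"] = word  # homophones would overwrite each other here

    # Step 2: lay nodes out breadth-first, so that siblings are adjacent
    # and every child index is larger than its parent's index.
    labels = [None]          # phone label of each node (root has none)
    first_child = [0]        # index of the first child of each node
    num_children = [0]       # number of children of each node
    word_at = [root["word"]]  # word ending at each node, if any
    queue = deque([(0, root)])
    while queue:
        idx, node = queue.popleft()
        kids = sorted(node["children"].items())
        first_child[idx] = len(labels)
        num_children[idx] = len(kids)
        for phone, child in kids:
            queue.append((len(labels), child))
            labels.append(phone)
            first_child.append(0)
            num_children.append(0)
            word_at.append(child["word"])
    return labels, first_child, num_children, word_at


if __name__ == "__main__":
    toy_lexicon = {
        "cat": ["k", "ae", "t"],
        "cap": ["k", "ae", "p"],
        "dog": ["d", "ao", "g"],
    }
    labels, first_child, num_children, word_at = build_flat_lexical_tree(toy_lexicon)
    for i, label in enumerate(labels):
        print(i, label, first_child[i], num_children[i], word_at[i])
```

With this layout, a decoder that expands a node reads its children from one consecutive slice of each array instead of following scattered pointers. The actual encoding used in the paper may differ.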


Similar resources

A two-layer lexical tree based beam search in continuous Chinese speech recognition

In this paper, an approach to continuous speech recognition based on a two-layer lexical tree is proposed. The search network is maintained by the two-layer lexical tree, in which the first layer reflects the word net and the phone net while the second layer the dynamic programming (DP). Because the acoustic information is tied in the second layer, the memory cost is so small that it has the ab...


Voice Assimilation Phenomenon and Its Implementation in LVCSR System with Lexical Tree and Bigram Language Model

In this paper a LVCSR system with implementation of the Czech voice assimilation phenomenon is proposed. The recognition system uses lexical trees and a bigram language model. The first part of this article is focused on voice assimilation phenomenon description, triphone lexical tree construction, and voice assimilation impact on LVCSR system performance. The second part outlines lexical tree ...


Statistical knowledge based frame synchronous search strategies in continuous speech recognition

In this paper, we propose a novel and efficient search algorithm for the Continuous Speech Recognition (CSR). The proposed algorithm is on the basis of the traditional Frame Synchronous Search (FSS) algorithm. It makes full use of some statistical knowledge, such as the Differential State Dwelling Distribution (DSDD), as one of the control factors for the state transition. It also incorporates ...


An efficient lexical tree search for large vocabulary continuous speech recognition

This paper describes an efficient search algorithm for a high speed and high accuracy LVCSR system. A conventionally used lexical tree search is an efficient method, but has a problem in incorporating the language probability. To solve this problem, we propose in this paper a new efficient search algorithm incorporating the language model structure. In our developed LVCSR, 2-pass search algorit...


Modeling Pronunciation Variation for Cantonese Speech Recognition

Due to the large variability of pronunciation in spontaneous speech, pronunciation modeling becomes a more challenging and essential part in speech recognition. In this paper, we describe two different approaches of pronunciation modeling by using decision tree. At lexical level, a pronunciation variation dictionary is built to obtain alternative pronunciations for each word, in which each entr...




Publication date: 2000