An hybrid language model for a continuous dictation prototype

Authors

  • Kamel Smaïli
  • Imed Zitouni
  • François Charpillet
  • Jean Paul Haton
Abstract

This paper describes the combination of a stochastic language model and a formal grammar modelled as a unification grammar. The stochastic model is trained on 42 million words extracted from the newspaper Le Monde and is based on smoothed 3-gram and 3-class models. The 3-class model is represented by a Markov chain made up of four states. Several experiments were carried out to determine which values are best for a specific training and test corpus. The experiments indicate that the unification grammar strongly reduces the number of hypotheses (sentences) produced by the stochastic model.
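
As a rough illustration of what "smoothed 3-gram and 3-class" can mean, the sketch below builds a word trigram model and a class trigram model on a toy corpus and mixes them. It rests on assumptions not stated in the abstract: smoothing is done by linear interpolation with lower-order estimates, the two models are combined with a single weight `lam`, and the word-to-class map, the weights, and all names (`InterpolatedTrigram`, `ClassTrigram`, `hybrid_prob`) are hypothetical. It is not the authors' implementation and does not cover the four-state Markov chain or the unification grammar.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams (as tuples) in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


class InterpolatedTrigram:
    """Trigram estimate smoothed by linear interpolation (an assumed scheme).

    Edge effects at the corpus boundaries are ignored for brevity.
    """

    def __init__(self, tokens, weights=(0.6, 0.3, 0.1)):
        self.l3, self.l2, self.l1 = weights  # assumed weights, summing to 1
        self.uni = ngram_counts(tokens, 1)
        self.bi = ngram_counts(tokens, 2)
        self.tri = ngram_counts(tokens, 3)
        self.total = len(tokens)

    def prob(self, w1, w2, w3):
        # Relative-frequency estimates; 0.0 when the context was never seen.
        p1 = self.uni[(w3,)] / self.total
        p2 = self.bi[(w2, w3)] / self.uni[(w2,)] if self.uni[(w2,)] else 0.0
        p3 = self.tri[(w1, w2, w3)] / self.bi[(w1, w2)] if self.bi[(w1, w2)] else 0.0
        return self.l3 * p3 + self.l2 * p2 + self.l1 * p1


class ClassTrigram:
    """Class-based trigram: P(w3 | class(w3)) * P(c3 | c1, c2)."""

    def __init__(self, tokens, word_to_class):
        self.word_to_class = word_to_class
        classes = [word_to_class[w] for w in tokens]
        self.class_lm = InterpolatedTrigram(classes)
        self.word_count = Counter(tokens)
        self.class_count = Counter(classes)

    def prob(self, w1, w2, w3):
        c1, c2, c3 = (self.word_to_class[w] for w in (w1, w2, w3))
        p_word_given_class = self.word_count[w3] / self.class_count[c3]
        return p_word_given_class * self.class_lm.prob(c1, c2, c3)


def hybrid_prob(word_lm, class_lm, w1, w2, w3, lam=0.5):
    """Mix the word and class estimates with an assumed weight lam."""
    return lam * word_lm.prob(w1, w2, w3) + (1 - lam) * class_lm.prob(w1, w2, w3)


# Toy corpus and hypothetical word classes, for illustration only.
corpus = "the cat sees the dog the dog sees the cat".split()
classes = {"the": "DET", "cat": "NOUN", "dog": "NOUN", "sees": "VERB"}

word_lm = InterpolatedTrigram(corpus)
class_lm = ClassTrigram(corpus, classes)

# "dog the cat" never occurs as a word trigram, but its class pattern
# NOUN DET NOUN does, so the class model still supports it.
p = hybrid_prob(word_lm, class_lm, "dog", "the", "cat")
print(f"P(cat | dog, the) = {p:.3f}")
```

The class model pools counts over words that share a class, which is why it can back up the word model on sparse contexts; the weights and the value of `lam` are the kind of parameters that would be tuned on held-out data.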

Similar resources

A multi-pass error detection and correction framework for Mandarin LVCSR

We previously proposed a multi-pass framework for Large Vocabulary Continuous Speech Recognition (LVCSR). The objective of this framework is to apply sophisticated linguistic models for recognition, while maintaining a balance between complexity and efficiency. The framework is composed of three passes: initial recognition, error detection and error correction. This paper presents and evaluates...

Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary but limited training data

This correspondence presents the first known results of complete recognition of continuous Mandarin speech for the Chinese language with very large vocabulary but very limited training data. Various acoustic and linguistic processing techniques were developed, and a prototype system of a continuous speech Mandarin dictation machine has been successfully implemented. The best recognition accurac...

Voltage Regulation of DC-DC Series Resonant Converter Operating in Discontinuous Conduction Mode: The Hybrid Control Approach

Dynamic modeling and control of dc-dc series resonant converter (SRC) especially when operating in discontinuous conduction mode (DCM) is still a challenge in power electronics. Due to semiconductors switching, SRC is naturally represented as a switched linear system, a class of hybrid systems. Nevertheless, the hybrid nature of the SRC is commonly neglected and it is modeled as a purely contin...

Continuous speech dictation in French

A major research activity at LIMSI is multilingual, speaker-independent, large vocabulary speech dictation. In this paper we report on efforts in large vocabulary, speaker-independent continuous speech recognition of French using the BREF corpus. Recognition experiments were carried out with vocabularies containing up to 20k words. The recognizer makes use of continuous density HMM with Gaussia...

IPA Japanese Dictation Free Software Project

Large vocabulary continuous speech recognition (LVCSR) is an important basis for the application development of speech recognition technology. We had constructed Japanese common LVCSR speech database and have been developing sharable Japanese LVCSR programs/models by the volunteer-based efforts. We have been engaged in the following two volunteer-based activities. a) IPSJ (Information Processin...

Publication date: 1997