Modèles de langage ad hoc pour la reconnaissance automatique de la parole. (Ad-hoc language models for automatic speech recognition)

Author

  • Stanislas Oger
Abstract

The three pillars of an automatic speech recognition system are the lexicon, the language model and the acoustic model. The lexicon lists all the words that can be transcribed, together with their pronunciations. The acoustic model describes how the phone units are pronounced, and the language model captures how words are linked together. In modern automatic speech recognition systems, the acoustic and language models are statistical. Estimating them requires large volumes of selected, standardized and annotated data. At present, the Web is by far the largest textual corpus available for English and French. The data it holds can potentially be used to build the vocabulary and to estimate and adapt the language model. The work presented here proposes new approaches for exploiting this resource in the context of language modeling. The document is organized into two parts. The first deals with the use of Web data to dynamically update the lexicon of the automatic speech recognition system. The proposed approach consists in extending the lexicon dynamically and locally, only when unknown words appear in the speech. New words are extracted from the Web through queries submitted to Web search engines. The pronunciation of these words is obtained with an automatic grapheme-to-phoneme transcriber. The second part presents a new way of handling the information contained on the Web, relying on concepts from possibility theory. A Web-based possibilistic language model is proposed. It provides an estimate of the possibility of a word sequence from knowledge of the existence of its sub-sequences on the Web. A probabilistic Web-based language model is also proposed. It relies on Web document counts to estimate n-gram probabilities. Several approaches for combining these models with classical models are proposed.
The results show that combining probabilistic and possibilistic models gives better results than classical probabilistic models alone. In addition, the models estimated from Web data perform better than those estimated from a corpus.
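
To make the two Web-based ideas in the abstract more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the estimator used in the thesis: the abstract does not give the formulas. The function names, the `TOY_COUNTS` table and the existence-based score are all hypothetical; a real system would obtain counts from search-engine queries rather than from a hard-coded dictionary.

```python
from typing import Mapping, Sequence, Tuple

NGram = Tuple[str, ...]

# Hypothetical document counts standing in for Web search-engine hit counts.
TOY_COUNTS: Mapping[NGram, int] = {
    ("speech",): 100,
    ("recognition",): 50,
    ("speech", "recognition"): 40,
    ("recognition", "system"): 10,
}


def ngram_probability(counts: Mapping[NGram, int], history: NGram, word: str) -> float:
    """Relative-frequency estimate of P(word | history) from document counts.

    Assumption: a plain count ratio, with no smoothing or back-off.
    """
    denominator = counts.get(history, 0)
    if denominator == 0:
        return 0.0  # unseen history: no estimate available in this sketch
    return counts.get(history + (word,), 0) / denominator


def existence_score(counts: Mapping[NGram, int], words: Sequence[str], order: int) -> float:
    """Fraction of the sequence's n-grams of the given order that have a nonzero count.

    Assumption: a simple stand-in for a possibility-style measure built only
    on whether sub-sequences exist, ignoring how often they occur.
    """
    ngrams = [tuple(words[i:i + order]) for i in range(len(words) - order + 1)]
    if not ngrams:
        return 0.0
    found = sum(1 for gram in ngrams if counts.get(gram, 0) > 0)
    return found / len(ngrams)


if __name__ == "__main__":
    print(ngram_probability(TOY_COUNTS, ("speech",), "recognition"))
    print(existence_score(TOY_COUNTS, ["speech", "recognition", "system", "zzz"], 2))
```

The first function depends on how often a sub-sequence is found, while the second depends only on whether it is found at all, which is the contrast the abstract draws between the probabilistic and possibilistic models.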


Similar resources

Contribution à l'étude de la variabilité de la voix des personnes âgées en reconnaissance automatique de la parole (Contribution to the study of elderly people's voice variability in automatic speech recognition) [in French]

ABSTRACT The use of speech recognition for assisted living runs up against the difficulty of using ASR systems that were not originally designed for elderly voices. To characterize how a recognition system behaves differently for elderly and non-elderly speakers, we studied which phonemes are recognized least well, relying...


Continuous space models with neural networks in natural language processing. (Modèles neuronaux pour la modélisation statistique de la langue)

Language models aim to characterize and assess the quality of natural-language utterances. They play a fundamental role in many application settings such as automatic speech recognition, machine translation, and information extraction and retrieval. The current state-of-the-art approach is the "historical" n-gram model, combined with...


Amélioration des Performances des Systèmes Automatiques de Reconnaissance de la Parole pour la Parole Non Native (Improving the performance of automatic speech recognition systems for non-native speech)

Abstract In this article we describe an approach to non-native automatic speech recognition (ASR). We propose two methods for adapting an existing automatic speech recognition system. The first is based on modifying the acoustic models by integrating models of the native language (langue maternelle, LM). The phonemes of the spoken language (L...


Issues in acoustic modeling of speech for automatic speech recognition

Stochastic modeling is a flexible method for handling the large variability in speech for recognition applications. In contrast to dynamic time warping where heuristic training methods for estimating word templates are used, stochastic modeling allows a probabilistic and automatic training for estimating models. This paper deals with the improvement of stochastic techniques, especially for a bet...


Production models as a structural basis for automatic speech recognition

We postulate in this paper that highly structured speech production models will have much to contribute to the ultimate success of speech recognition in view of the weaknesses of the theoretical foundation underpinning current technology. These weaknesses are analyzed in terms of phonological modeling and of phonetic-interface modeling. We present two probabilistic speech recognition models wit...



Publication date: 2011