Sub-lexical modelling using a finite state transducer framework

Authors

  • Xiaolong Mou
  • Victor Zue
Abstract

The finite state transducer (FST) approach [1] has recently been widely used as an effective and flexible framework for speech systems. In this framework, a speech recognizer is represented as the composition of a series of FSTs that combine various knowledge sources across sub-lexical and high-level linguistic layers. In this paper, we use the FST framework to explore several sub-lexical modelling approaches, and we propose a hybrid model that combines an ANGIE [2] morpho-phonemic model with a lexicon-based phoneme network model. These sub-lexical models are converted to FST representations and can be conveniently composed to build the recognizer. Our preliminary perplexity experiments show that the proposed hybrid model imposes strong constraints on in-vocabulary words while also providing detailed sub-lexical syllabification and morphological analysis of out-of-vocabulary (OOV) words. It thus has the potential to offer good performance and to handle the OOV problem in speech recognition better.
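The idea of building a recognizer by composing FSTs can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary-based FST layout, the function names `compose` and `transduce`, the toy phone and phoneme symbols, and all weights are assumptions made for illustration. It also assumes epsilon-free transducers with additive (negative-log-style) costs, which real sub-lexical models generally do not satisfy.

```python
from collections import deque

def compose(a, b):
    """Compose two epsilon-free weighted FSTs; costs add along matched arcs."""
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = state = queue.popleft()
        arcs[state] = []
        # A pair state is final only if both component states are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for (i, mid, w1, na) in a["arcs"].get(qa, []):
            for (mid2, o, w2, nb) in b["arcs"].get(qb, []):
                if mid == mid2:  # output of `a` must match input of `b`
                    nxt = (na, nb)
                    arcs[state].append((i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}

def transduce(fst, symbols):
    """Return the lowest-cost (output, cost) for an input sequence, or None."""
    best = None
    stack = [(fst["start"], 0, [], 0.0)]
    while stack:
        state, pos, out, cost = stack.pop()
        if pos == len(symbols):
            if state in fst["finals"]:
                total = cost + fst["finals"][state]
                if best is None or total < best[1]:
                    best = (out, total)
            continue
        for (i, o, w, nxt) in fst["arcs"].get(state, []):
            if i == symbols[pos]:
                stack.append((nxt, pos + 1, out + [o], cost + w))
    return best

# Hypothetical phonological layer: surface phones -> phonemes.
# Arcs are (input, output, cost, next_state).
PHONE_TO_PHONEME = {
    "start": 0,
    "finals": {0: 0.0},
    "arcs": {0: [("b", "b", 0.0, 0), ("ae", "ae", 0.0, 0),
                 ("t", "t", 0.0, 0), ("dx", "t", 0.5, 0),
                 ("dx", "d", 1.5, 0)]},
}

# Hypothetical lexicon layer: accepts only the phoneme string "b ae t".
PHONEME_LEXICON = {
    "start": 0,
    "finals": {3: 0.0},
    "arcs": {0: [("b", "b", 0.0, 1)],
             1: [("ae", "ae", 0.0, 2)],
             2: [("t", "t", 0.25, 3)]},
}

composed = compose(PHONE_TO_PHONEME, PHONEME_LEXICON)
print(transduce(composed, ["b", "ae", "dx"]))
```

In this toy setup the lexicon layer removes the `dx`-to-`d` reading, because no in-vocabulary phoneme string uses it. That is a small-scale analogue of the strong in-vocabulary constraint the abstract describes.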


Similar resources

Sub-lexical Modelling Using a Finite State Transducer Framework

The finite state transducer (FST) approach [1] has been widely used recently as an effective and flexible framework for speech systems. In this framework, a speech recognizer is represented as the composition of a series of FSTs combining various knowledge sources across sub-lexical and high-level linguistic layers. In this paper, we use this FST framework to explore some sublexical modelling a...


Context-dependent probabilistic hierarchical sublexical modelling using finite state transducers

This paper describes a unified architecture for integrating sub-lexical models with speech recognition, and a layered framework for context-dependent probabilistic hierarchical sublexical modelling. Previous work [1, 2, 3] has demonstrated the effectiveness of sub-lexical modelling using a core context-free grammar (CFG) augmented with context-dependent probabilistic models. Our major motivatio...


Weighted Finite-State Morphological Analysis of Finnish Inflection and Compounding

Finnish has a very productive compounding and a rich inflectional system, which causes ambiguity in the morphological segmentation of compounds made with finite state transducer methods. In order to disambiguate the compound segmentations, we compare three different strategies, which we cast in a probabilistic framework. We present a method for implementing the probabilistic framework as part o...


Weighted Finite-State Morphological Analysis of Finnish Compounding with HFST-LEXC

Finnish has a very productive compounding and a rich inflectional system, which causes ambiguity in the morphological segmentation of compounds made with finite state transducer methods. In order to disambiguate the compound segmentations, we compare three different strategies, which are all cast in the same probabilistic framework and compared for the first time. We present a method for implem...


Statistical modeling of phonological rules through linguistic hierarchies

This paper describes our research aimed at acquiring a generalized probability model for alternative phonetic realizations in conversational speech. For all of our experiments, we utilize the SUMMIT landmark-based speech recognition framework. The approach begins with a set of formal context-dependent phonological rules, applied to the baseforms in the recognizer’s lexicon. A large speech corpu...



Publication date: 2001