Improved parameter tying for efficient acoustic model evaluation in large vocabulary continuous speech recognition

Authors

  • Jacques Duchateau
  • Kris Demuynck
  • Dirk Van Compernolle
  • Patrick Wambacq
Abstract

In an HMM-based large vocabulary continuous speech recognition system, the evaluation of context-dependent acoustic models is very time consuming. In semi-continuous HMMs, a state is modelled as a mixture of elementary, generally Gaussian, probability density functions. Observation probability calculations for these states can be made faster by reducing the size of the mixture of Gaussians used to model them. In this paper, we propose different criteria to decide which Gaussians should remain in the mixture for a state and which ones can be removed. The performance of the criteria is compared on context-dependent tied-state models using the WSJ recognition task. Our novel criterion, which removes a Gaussian from a state if it is based on too little acoustic data, outperforms the other described criteria.
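The idea of dropping a Gaussian from a state's mixture when it is backed by too little data can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so everything here is an assumption made for illustration: the function name `prune_state_mixture`, the use of accumulated per-state occupancy counts as the measure of "acoustic data", the `min_count` threshold, and the renormalisation of the remaining weights.

```python
from typing import Dict


def prune_state_mixture(
    weights: Dict[int, float],
    counts: Dict[int, float],
    min_count: float,
) -> Dict[int, float]:
    """Return a reduced mixture for one state.

    weights:   mixture weight of each shared Gaussian in this state
               (Gaussian index -> weight).
    counts:    hypothetical accumulated occupancy count per Gaussian for
               this state, used here as a proxy for "amount of acoustic data".
    min_count: hypothetical threshold; Gaussians below it are removed.
    """
    # Keep only Gaussians supported by enough data in this state.
    kept = {g: w for g, w in weights.items() if counts.get(g, 0.0) >= min_count}

    # Safeguard (an assumption): never leave a state with an empty mixture.
    if not kept:
        best = max(weights, key=lambda g: counts.get(g, 0.0))
        kept = {best: weights[best]}

    # Renormalise so the remaining weights sum to one.
    total = sum(kept.values())
    return {g: w / total for g, w in kept.items()}


if __name__ == "__main__":
    state_weights = {0: 0.5, 1: 0.3, 2: 0.2}
    state_counts = {0: 120.0, 1: 4.0, 2: 60.0}
    print(prune_state_mixture(state_weights, state_counts, min_count=10.0))
```

With a smaller mixture per state, fewer weighted Gaussian likelihoods have to be summed when the observation probability of that state is computed, which is the source of the speed-up the abstract refers to.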


Similar resources

Large Vocabulary Continuous Speech Recognition: Improvements in Acoustic Modelling and Search

This paper describes the main improvements we made in two of the basic modules in our HMM-based large vocabulary speaker independent continuous speech recognition system: namely in the acoustic modelling and in the search engine. For the acoustic modelling, we paid special attention both to improved parameter tying at the density and at the state level, and to fast evaluation of the HMMs. For th...


Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...


Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems

HMM-based large vocabulary speech recognition systems usually have a very large number of statistical parameters. For better estimation, the number of parameters is reduced by sharing them across models. The parameter sharing is decided by regression trees which are built using phonetic classes designed either by a human expert or by data-driven methods. In situations where neither of these are...


Rule-Based Triphone Mapping for Acoustic Modeling in Automatic Speech Recognition

This paper presents rule-based triphone mapping for acoustic model training in automatic speech recognition. We test whether the incorporation of expanded knowledge at the level of parameter tying in acoustic modeling improves the performance of automatic speech recognition in Slovak. We propose a novel technique of knowledge-based triphone tying, which allows the synthesis of unseen triphones. The...


Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...




Publication date: 1998