An Acoustic-Phonetic and a Model-Theoretic Analysis of Subspace Distribution Clustering Hidden Markov Models

Author

  • Brian Kan-Wing Mak
Abstract

Recently, we proposed a new derivative of conventional continuous density hidden Markov modeling (CDHMM) that we call “subspace distribution clustering hidden Markov modeling” (SDCHMM). SDCHMMs can be created by tying low-dimensional subspace Gaussians in CDHMMs. In the tasks we tried, usually only 32–256 subspace Gaussian prototypes were needed in an SDCHMM-based system to maintain the recognition performance of its original CDHMM-based system, a reduction of Gaussian parameters by one to three orders of magnitude. Consequently, both recognition time and memory were greatly reduced. We have also shown that if the underlying subspace distribution tying structure is known, it may be used to train an SDCHMM-based system from scratch with as little as eight minutes of speech. All the results suggest that there is substantial redundancy in conventional CDHMMs and that the SDCHMM is a more compact model. In this paper, we analyze the tying structure from two perspectives: the acoustic-phonetic perspective, showing that the tying structure seems to capture prominent relationships among phones; and the model-theoretic perspective, showing that SDCHMMs, if properly created from CDHMMs, may be preferred over the latter as they are less complex and have the potential for greater generalization power.
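The parameter saving described in the abstract can be illustrated with some rough arithmetic. The sketch below is not the paper's procedure; it only counts Gaussian parameters under stated assumptions: diagonal covariances, a feature vector split into equal-sized streams, and hypothetical sizes (40,000 Gaussians, 39 dimensions, 13 streams, 128 prototypes per stream) chosen purely for illustration. The function names are likewise made up for this example.

```python
def cdhmm_gaussian_params(num_gaussians: int, dim: int) -> int:
    """Mean plus diagonal variance for every full-space Gaussian (assumed diagonal)."""
    return num_gaussians * 2 * dim


def sdchmm_gaussian_params(num_streams: int, prototypes_per_stream: int, dim: int) -> int:
    """Mean plus diagonal variance for every subspace Gaussian prototype.

    Assumes the feature vector is split into equal-sized streams.
    """
    if dim % num_streams != 0:
        raise ValueError("dim must be divisible by num_streams in this sketch")
    sub_dim = dim // num_streams
    return num_streams * prototypes_per_stream * 2 * sub_dim


def sdchmm_index_entries(num_gaussians: int, num_streams: int) -> int:
    """One prototype index per stream for each original Gaussian (the tying structure)."""
    return num_gaussians * num_streams


# Hypothetical sizes, not taken from the paper.
NUM_GAUSSIANS, DIM, NUM_STREAMS, NUM_PROTOTYPES = 40_000, 39, 13, 128

cdhmm = cdhmm_gaussian_params(NUM_GAUSSIANS, DIM)
sdchmm = sdchmm_gaussian_params(NUM_STREAMS, NUM_PROTOTYPES, DIM)
indices = sdchmm_index_entries(NUM_GAUSSIANS, NUM_STREAMS)

print(f"CDHMM Gaussian parameters:  {cdhmm}")
print(f"SDCHMM Gaussian parameters: {sdchmm}")
print(f"SDCHMM prototype indices:   {indices}")
print(f"Reduction factor:           {cdhmm / sdchmm:.1f}")
```

Under these assumptions the reduction factor equals the number of original Gaussians divided by the number of prototypes per stream. The tied model still stores one small integer index per Gaussian per stream, which this count reports separately from the Gaussian parameters.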

Similar resources

Microsoft Word - Hybridmodel2.dot

Today’s state-of-the-art speech recognition systems typically use continuous density hidden Markov models with mixture of Gaussian distributions. Such speech recognition systems have problems; they require too much memory to run, and are too slow for large vocabulary applications. Two approaches are proposed for the design of compact acoustic models, namely, subspace distribution clustering hid...

Subspace distribution clustering hidden Markov model

Most contemporary laboratory recognizers require too much memory to run, and are too slow for mass applications. One major cause of the problem is the large parameter space of their acoustic models. In this paper, we propose a new acoustic modeling methodology which we call subspace distribution clustering hidden Markov modeling (SDCHMM) with the aim of achieving much more compact acoustic mode...

Direct training of subspace distribution clustering hidden Markov model

It generally takes a long time and requires a large amount of speech data to train hidden Markov models for a speech recognition task of a reasonably large vocabulary. Recently, we proposed a compact acoustic model called “subspace distribution clustering hidden Markov model” (SDCHMM) with an aim to save some of the training effort. SDCHMMs are derived from tying continuous density hidden Marko...

Comparison of low footprint acoustic modeling techniques for embedded ASR systems

In this paper we compare the performance of speech recognition systems based on hidden Markov models (HMM) with quantized parameters (qHMMs) and subspace distribution clustering hidden Markov models (SDCHMMs). Both of these HMM types provide performance similar to that of continuous density HMMs, but with significantly reduced memory requirements (approximately 90% less memory was needed to store the H...

Learning phonetic features from waveforms

Unsupervised learning of broad phonetic classes by infants was simulated using a statistical mixture model. With the phonetic labels removed, hand-transcribed segments from the TIMIT database were used in model-based clustering to obtain data-driven classes. Simple Hidden Markov Models were chosen to be the components of the mixture, with Mel-Cepstral coefficients as the front-end. The sound cl...

Journal:
  • I. J. Speech Technology

Volume 7

Publication date: 2004