Feature-dependent allophone clustering
Abstract
We propose a novel method for clustering allophones, Feature-Dependent Allophone Clustering (FD-AC), that determines a feature-dependent HMM topology automatically. Existing allophone clustering methods share parameters between allophone models whose feature vector sequences behave similarly. However, the individual features of those vector sequences do not necessarily share a common allophone clustering structure. The vector sequences may therefore be modeled better by assigning each feature its own optimal allophone clustering structure. In this paper, we propose Feature-Dependent Successive State Splitting (FD-SSS) as an implementation of FD-AC. In speaker-dependent continuous phoneme recognition experiments, HMMs created by FD-SSS reduced error rates by about 10% compared with conventional HMMs that use a common allophone clustering structure for all features.
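The core idea, that each feature may call for a different grouping of allophones, can be illustrated with a toy sketch. This is not FD-SSS and does not involve HMMs. It assumes hypothetical per-allophone mean values for two feature streams (the allophone labels, the stream names `cepstrum` and `delta`, and all numbers are invented for illustration) and groups allophones separately per stream with a simple one-dimensional gap threshold.

```python
from typing import Dict, List

# Hypothetical toy data: one mean value per allophone model and feature stream.
# Labels, stream names, and numbers are illustrative, not taken from the paper.
ALLOPHONE_MEANS: Dict[str, Dict[str, float]] = {
    "a(b_)": {"cepstrum": 0.10, "delta": 1.00},
    "a(d_)": {"cepstrum": 0.12, "delta": 2.00},
    "a(k_)": {"cepstrum": 0.90, "delta": 1.05},
    "a(g_)": {"cepstrum": 0.95, "delta": 2.10},
}


def cluster_1d(values: Dict[str, float], threshold: float) -> List[List[str]]:
    """Sort names by value and start a new cluster at every gap >= threshold."""
    ordered = sorted(values, key=lambda name: values[name])
    clusters: List[List[str]] = [[ordered[0]]]
    for prev, name in zip(ordered, ordered[1:]):
        if values[name] - values[prev] < threshold:
            clusters[-1].append(name)   # close to its neighbor: share a cluster
        else:
            clusters.append([name])     # large gap: open a new cluster
    return clusters


def feature_dependent_clusters(
    means: Dict[str, Dict[str, float]], threshold: float
) -> Dict[str, List[List[str]]]:
    """Cluster the allophones independently for each feature stream."""
    features = list(next(iter(means.values())).keys())
    return {
        feature: cluster_1d({name: m[feature] for name, m in means.items()}, threshold)
        for feature in features
    }


if __name__ == "__main__":
    for feature, clusters in feature_dependent_clusters(ALLOPHONE_MEANS, 0.3).items():
        print(feature, clusters)
```

On this toy data the two streams yield different groupings: `cepstrum` pairs the allophones by one context split and `delta` by another. A single clustering structure common to all features could not represent both groupings at once, which is the situation the abstract describes.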
Related resources
HMM state clustering across allophone class boundaries
We present a novel approach to hidden Markov model (HMM) state clustering based on the use of broad phone classes and an allophone class entropy measure. Most state-of-the-art large-vocabulary speech recognizers are based on context-dependent (CD) phone HMMs that use Gaussian mixture models for the state-conditioned observation densities. A common approach for robust HMM parameter estimation is ...
Extensions to phone-state decision-tree clustering: single tree and tagged clustering
The following article describes two extensions to the "traditional" decision tree methods for clustering allophone HMM states in LVCSR systems. The first, single tree clustering, combines all allophone states of all phones into a single tree. This can be used to improve performance for very small systems. The single tree clustering structure can also be exploited for speaker and channel adaptatio...
Parameter tying for flexible speech recognition
This paper presents two parameter tying techniques which enable a trade-off between computational cost and recognition performance of a speaker-independent flexible speech recognition system working over the telephone network. Parameter tying is conducted at phonetic and acoustic levels. At the phonetic level, allophone and triphone based phonetic modeling are used simultaneously to achieve th...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
A discriminant measure for model complexity adaptation
We present a discriminant measure that can be used to determine the model complexity in a speech recognition system. In the speech recognition process, given a test feature vector, the conditional probability of the feature vector has to be obtained for several allophone (sub-phonetic unit) classes using a Gaussian-mixture density model for each class. The Gaussian-mixture models are...