Improved covariance modeling for GMM in speaker identification
Abstract
The Gaussian Mixture Model (GMM) with diagonal covariance matrices is commonly used in text-independent speaker identification. However, a diagonal covariance matrix implies the strong assumption that the feature elements are independent. Although Gaussian mixtures with diagonal covariance can model correlation to some extent, the model precision is still limited. To alleviate this problem, this paper proposes a framework for sharing linear transformations among the components and introduces a new unsupervised hierarchical clustering algorithm to implement it. In the framework, the full covariance of each component is represented by a shared linear transformation and a component-specific diagonal covariance. Different linear transformation estimation approaches, i.e., PCA, LDA and MLLT, are proposed and compared. Experiments show that our algorithm, using each of the approaches, achieves a significant identification error reduction over the best diagonal covariance models.
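The covariance representation described in the abstract can be illustrated with a small numerical sketch. The abstract does not give the exact parameterization, so the form below is an assumption: the common semi-tied convention Sigma_k = A^{-1} diag(var_k) A^{-T}, where A is a transform shared by several components and var_k is a component-specific diagonal. The function names and the random test values are hypothetical and are not taken from the paper.

```python
import numpy as np

def diag_gauss_logpdf(y, mean, var):
    """Log-density of a Gaussian with diagonal covariance diag(var) at y."""
    d = y.shape[-1]
    return -0.5 * (d * np.log(2 * np.pi)
                   + np.sum(np.log(var))
                   + np.sum((y - mean) ** 2 / var))

def shared_transform_logpdf(x, A, mean, var):
    """Log-density under N(mean, Sigma) with Sigma = A^-1 diag(var) A^-T.

    Assumed parameterization: A is the shared transform, var is the
    component-specific diagonal, mean lives in the original feature space.
    Only a diagonal Gaussian is evaluated, plus the log|det A| Jacobian term.
    """
    _, logabsdet = np.linalg.slogdet(A)
    return logabsdet + diag_gauss_logpdf(A @ x, A @ mean, var)

def full_gauss_logpdf(x, mean, cov):
    """Reference log-density with an explicit full covariance matrix."""
    d = x.shape[-1]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(cov, diff))

rng = np.random.default_rng(0)
d = 4
# Hypothetical invertible shared transform: orthogonal matrix times positive scales.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
A = Q * rng.uniform(0.5, 2.0, size=d)
var = rng.uniform(0.5, 2.0, size=d)   # component-specific diagonal
mean = rng.normal(size=d)
x = rng.normal(size=d)

# Full covariance implied by the shared transform and the diagonal.
A_inv = np.linalg.inv(A)
cov = A_inv @ np.diag(var) @ A_inv.T

lp_shared = shared_transform_logpdf(x, A, mean, var)
lp_full = full_gauss_logpdf(x, mean, cov)
print(np.isclose(lp_shared, lp_full))
```

Under this assumed form, each component stores only a mean and a diagonal while the d-by-d transform is stored once per group of components, which is why the approach sits between diagonal and full covariance models in parameter count.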
Similar resources
Distance-based Gaussian mixture model for speaker recognition over the telephone
The paper considers text independent speaker identification over the telephone using short training and testing data. Gaussian Mixture Modeling (GMM) is used in the testing phase, but the parameters of the model are taken from clusters obtained for the training data by an adequate choice of feature vectors and a distance measure without optimization in the maximum likelihood (ML) sense. This di...
On the use of orthogonal GMM in speaker recognition
The Gaussian mixture modeling (GMM) techniques are increasingly being used for both speaker identification and verification. Most of these models assume diagonal covariance matrices. Although empirically any distribution can be approximated with a diagonal GMM, a large number of mixture components are usually needed to obtain a good approximation. A consequence of using a large GMM is that its ...
GMM Based on Local Robust PCA for Speaker Identification
ABSTRACT: To solve the problems of outliers and high dimensionality of training feature vectors in speaker identification, in this paper, we propose an efficient GMM based on local robust PCA with VQ. The proposed method firstly partitions the data space into several disjoint regions by VQ, and then performs robust PCA using the iteratively reweighted covariance matrix in each region. Finally, ...
Local fuzzy PCA based GMM with dimension reduction on speaker identification
To reduce the high dimensionality required for training of feature vectors in speaker identification, we propose an efficient GMM based on local PCA with fuzzy clustering. The proposed method firstly partitions the data space into several disjoint clusters by fuzzy clustering, and then performs PCA using the fuzzy covariance matrix on each cluster. Finally, the GMM for speaker is obtained from ...
Using second order statistics for text independent speaker verification
This paper describes a computationally simple method to perform text independent speaker verification using second order statistics. The suggested method, called Utterance Level Scoring (ULS), allows obtaining a normalized score using a single pass through the frames of the tested utterance. The utterance sample covariance is first calculated and then compared to the speaker covariance using a ...