Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling
Abstract
This paper proposes a novel cluster modeling method for intercluster distance measurement within the framework of agglomerative hierarchical speaker clustering: incremental Gaussian mixture cluster modeling. The method models each initial cluster with a single Gaussian distribution, but represents any newly merged cluster with a distribution whose pdf is the weighted sum of the pdfs of the model distributions of the clusters being merged. As a result, cluster models transition smoothly to Gaussian mixtures whose number of components grows as the merging recursions continue. The proposed method overcomes the limited representation capability of conventional single-Gaussian cluster modeling. Experiments on various sets of initial clusters demonstrate that the approach improves the reliability of speaker clustering performance.
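The merging rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how the mixture weights are chosen, so the sketch assumes they are proportional to the number of feature frames in each cluster. The names `Cluster`, `initial_cluster`, `merge`, and `log_pdf`, as well as the small `ridge` term added to the covariance, are hypothetical and introduced only for illustration. The intercluster distance measure itself is not shown.

```python
import numpy as np


class Cluster:
    """Cluster model: a Gaussian mixture plus the number of frames it covers."""

    def __init__(self, weights, means, covs, n_frames):
        self.weights = np.asarray(weights, dtype=float)  # shape (K,)
        self.means = np.asarray(means, dtype=float)      # shape (K, d)
        self.covs = np.asarray(covs, dtype=float)        # shape (K, d, d)
        self.n_frames = int(n_frames)


def initial_cluster(frames, ridge=1e-6):
    """Model an initial cluster with a single full-covariance Gaussian."""
    frames = np.asarray(frames, dtype=float)
    d = frames.shape[1]
    mean = frames.mean(axis=0)
    # ridge is an assumed regularizer that keeps the covariance invertible
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True)) + ridge * np.eye(d)
    return Cluster([1.0], mean[None, :], cov[None, :, :], len(frames))


def merge(a, b):
    """Merged pdf = weighted sum of the two cluster pdfs.

    Assumption: weights are proportional to frame counts. No parameters
    are re-estimated; the components of both models are simply kept.
    """
    total = a.n_frames + b.n_frames
    weights = np.concatenate([a.weights * a.n_frames, b.weights * b.n_frames]) / total
    means = np.concatenate([a.means, b.means])
    covs = np.concatenate([a.covs, b.covs])
    return Cluster(weights, means, covs, total)


def log_pdf(cluster, x):
    """Log density of the cluster's mixture model at each row of x."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    d = x.shape[1]
    per_component = []
    for w, mu, cov in zip(cluster.weights, cluster.means, cluster.covs):
        diff = x - mu
        maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
        _, logdet = np.linalg.slogdet(cov)
        per_component.append(np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha))
    return np.logaddexp.reduce(np.stack(per_component), axis=0)


# Usage with synthetic 2-D "frames" (illustrative data only)
rng = np.random.default_rng(0)
c1 = initial_cluster(rng.normal(0.0, 1.0, size=(40, 2)))
c2 = initial_cluster(rng.normal(3.0, 1.0, size=(60, 2)))
c12 = merge(c1, c2)        # two components
print(len(c12.weights), c12.weights)
```

Because merging only concatenates components and rescales weights, the number of Gaussians in a cluster model equals the number of initial clusters it has absorbed, which matches the incremental growth the abstract describes.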
Similar resources
Signature cluster model selection for incremental Gaussian mixture cluster modeling in agglomerative hierarchical speaker clustering
Agglomerative hierarchical speaker clustering (AHSC) has been widely used for classifying speech data by speaker characteristics. Its bottom-up, one-way structure of merging the closest cluster pair at every recursion step, however, makes it difficult to recover from incorrect merging. Hence, making AHSC robust to incorrect merging is an important issue. In this paper we address this problem in...
An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models
In this paper, we improve our previous cluster model selection method for agglomerative hierarchical speaker clustering (AHSC) based on incremental Gaussian mixture models (iGMMs). In the previous work, we measured the likelihood of all the data points in a given cluster for each mixture component of the GMM modeling the cluster. Then, we selected the N-best component Gaussians with the highes...
A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large scale data
An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization to make it possible to estimate the number of speakers. For this purpose, a framework of non-parametric Bayesian modeling is implemented with the Markov chain Monte Carlo and incorporated in the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet pr...
Text Independent Speaker Identification Model Using Finite Doubly Truncated Gaussian Distribution and Hierarchical Clustering
In speaker identification, the goal is to determine which one of a group of known voices best matches the input voice. Modelling the speaker voices is an important consideration for many applications. In developing the model, it is customary to assume that the voice of an individual speaker is characterized by a finite-component Gaussian mixture model. However, the Mel ...
Integrate template matching and statistical modeling for speech recognition
We propose a novel approach of integrating template matching with statistical modeling to improve continuous speech recognition. We use multiple Gaussian Mixture Model (GMM) indices to represent each frame of speech templates, use hierarchical agglomerative clustering to generate template representatives, and use log likelihood ratio as the local distance measure for DTW template matching in la...
Publication date: 2008