Signature cluster model selection for incremental Gaussian mixture cluster modeling in agglomerative hierarchical speaker clustering
Abstract
Agglomerative hierarchical speaker clustering (AHSC) has been widely used for classifying speech data by speaker characteristics. Its bottom-up, one-way structure of merging the closest cluster pair at every recursion step, however, makes it difficult to recover from incorrect merging. Hence, making AHSC robust to incorrect merging is an important issue. In this paper we address this problem in the framework of AHSC based on incremental Gaussian mixture models, which we previously introduced for better representing variable cluster size. Specifically, to minimize contamination in cluster models by heterogeneous data, we select and keep updating a representative (or signature) model for each cluster during AHSC. Experiments on meeting speech excerpts (4 hours total) verify that the proposed approach improves average speaker clustering performance by approximately 20% (relative).
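The bottom-up merge loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D features, the names `Gaussian`, `fit_gaussian`, `mixture_mean`, `distance`, and `ahsc`, and the mean-gap distance are all assumptions made for illustration, and the signature-model selection step is not included. The sketch only shows that each initial cluster is modeled by one Gaussian, that a merged cluster keeps the components of both parents (so the model grows with cluster size), and that a merge is never undone.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Gaussian:
    mean: float
    var: float
    count: int  # number of frames this component was estimated from


def fit_gaussian(frames):
    """Fit a single 1-D Gaussian to one initial segment."""
    n = len(frames)
    mean = sum(frames) / n
    var = sum((x - mean) ** 2 for x in frames) / n
    return Gaussian(mean, max(var, 1e-6), n)


def mixture_mean(cluster):
    """Mean of a mixture whose weights are proportional to frame counts."""
    total = sum(g.count for g in cluster)
    return sum(g.count * g.mean for g in cluster) / total


def distance(a, b):
    # Placeholder distance (assumption): gap between the two mixture means.
    # The abstract does not specify the inter-cluster distance.
    return abs(mixture_mean(a) - mixture_mean(b))


def ahsc(segments, n_clusters):
    """Merge the closest cluster pair until n_clusters remain."""
    # Every initial cluster is a one-component mixture.
    clusters = [(fit_gaussian(s),) for s in segments]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: distance(clusters[p[0]], clusters[p[1]]),
        )
        # The merged model keeps all components of both parents.
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

Because the loop only ever merges, one wrong merge permanently mixes data from two speakers into a single model, which is the contamination the signature-model approach is meant to limit.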
Related papers
An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models
In this paper, we improve our previous cluster model selection method for agglomerative hierarchical speaker clustering (AHSC) based on incremental Gaussian mixture models (iGMMs). In the previous work, we measured the likelihood of all the data points in a given cluster for each mixture component of the GMM modeling the cluster. Then, we selected the N-best component Gaussians with the highes...
Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling
This paper proposes a novel cluster modeling method for intercluster distance measurement within the framework of agglomerative hierarchical speaker clustering, namely, incremental Gaussian mixture cluster modeling. This method uses a single Gaussian distribution to model each initial cluster, but represents any newly merged cluster using a distribution whose pdf is the weighted sum of the pdf’...
A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large scale data
An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization to make it possible to estimate the number of speakers. For this purpose, a framework of non-parametric Bayesian modeling is implemented with the Markov chain Monte Carlo and incorporated in the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet pr...
Text Independent Speaker Identification Model Using Finite Doubly Truncated Gaussian Distribution and Hierarchical Clustering
In speaker identification, the goal is to determine which of a group of known voices best matches the input voice. Modelling the speaker voices is an important consideration for many applications. In developing the model, it is customary to assume that the voice of an individual speaker is characterized by a finite-component Gaussian mixture model. However, the Mel ...
An online incremental speaker adaptation method using speaker-clustered initial models
We previously proposed an incremental speaker adaptation method combined with automatic speaker-change detection for broadcast news transcription where speakers change frequently and each of them utters a series of several sentences. In this method, the speaker change is detected using speaker-independent and speaker-adaptive Gaussian mixture models (GMMs). Both phone HMMs and GMMs are incremen...
Publication date: 2009