Prosodic and Phonetic Features for Speaker Clustering in Speaker Diarization Systems

Authors

  • Janez Zibert
  • France Mihelic
Abstract

This work focuses on speaker clustering methods used in speaker diarization systems. The purpose of speaker clustering is to group segments that belong to the same speaker; it is usually applied in the last stage of the speaker diarization process. We concentrate on developing suitable representations of speaker segments for clustering and implement two speaker clustering systems. The first is a standard approach that uses bottom-up agglomerative clustering with the Bayesian Information Criterion (BIC) as the merging criterion. The second is a fusion-based speaker clustering system in which speaker segments are modeled by acoustic and prosodic representations. In this way we additionally model the speakers' prosodic and phonetic characteristics and combine them with the basic acoustic information. This leads to improved clustering of segments in cases of similar speaker acoustic properties and poor acoustic conditions.
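
The first system described in the abstract, bottom-up agglomerative clustering with BIC as the merging criterion, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each segment is a matrix of feature vectors modeled by a single full-covariance Gaussian, uses one common sign convention (a negative ΔBIC favors merging), and the names `delta_bic`, `bic_clustering`, and the penalty weight `lam` are hypothetical.

```python
import numpy as np


def delta_bic(x, y, lam=1.0):
    """Delta-BIC between two segments (rows = frames, columns = features).

    Assumption: each segment is modeled by one full-covariance Gaussian.
    Negative values favor a single shared model, i.e. merging.
    """
    n1, d = x.shape
    n2 = y.shape[0]
    n = n1 + n2

    def logdet(z):
        # Maximum-likelihood covariance with a small ridge for stability.
        cov = np.cov(z, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Penalty for the extra parameters of the two-model hypothesis.
    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * np.log(n)
    gain = 0.5 * (n * logdet(np.vstack([x, y]))
                  - n1 * logdet(x) - n2 * logdet(y))
    return gain - penalty


def bic_clustering(segments, lam=1.0):
    """Greedy bottom-up clustering; returns lists of segment indices."""
    clusters = [np.asarray(s, dtype=float) for s in segments]
    members = [[i] for i in range(len(clusters))]
    while len(clusters) > 1:
        # Find the pair of clusters with the lowest delta-BIC.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(clusters[i], clusters[j], lam)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:
            break  # no remaining pair is better explained by one model
        clusters[i] = np.vstack([clusters[i], clusters[j]])
        members[i] = members[i] + members[j]
        del clusters[j]
        del members[j]
    return members
```

Calling `bic_clustering` on a list of per-segment feature matrices returns groups of segment indices, one group per hypothesized speaker. In practice the penalty weight `lam` is usually tuned on development data, and the fusion-based system in the abstract would combine such an acoustic score with prosodic and phonetic information, which this sketch does not attempt.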

Similar resources

Phonetic subspace mixture model for speaker diarization

This paper presents an improved distance measure for speaker clustering in speaker diarization systems. The proposed phonetic subspace mixture (PSM) model introduces phonetic information to the BIC distance measure. Therefore, the new PSM model-based BIC distance measure can remove the effect of phonetic content on the diarization results. The typical BIC distance measure can be seen as a speci...

The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization

Overlapping speech is responsible for a certain amount of errors produced by standard speaker diarization systems in meeting environment. We are investigating a set of prosody-based long-term features as a potential complement to our overlap detection system relying on short-term spectral parameters. The most relevant features are selected in a two-step process. They are firstly evaluated and s...

Speeding Up Speaker Diarization by Using Prosodic Features

In this article we present a method to speed up agglomerative clustering used in speaker diarization by using long-term prosodic features. A set of these features is used to decide which clusters should be merged. This strategy reduces the number of decisions that have to be performed using the more calculation-intensive method based on the Bayesian Information Criterion (BIC). We show a speedu...

Multi-stream speaker diarization systems for the meetings domain

In the context of speech and speaker recognition systems, it is well known that combining different feature streams can significantly improve performance. However, the application of multi-stream (MS) techniques to speaker diarization systems has not been extensively studied. In this paper, we address this issue: we formulate different MS techniques, such as feature combination, ...

Speaker diarization of spontaneous meeting room conversations

Speaker diarization is the task of identifying “who spoke when” in an audio stream containing multiple speakers. This is an unsupervised task, as there is no a priori information about the speakers. Diagnostic studies on state-of-the-art diarization systems have isolated three main issues with the systems: overlapping speech, effects of background noise and speech/nonspeech detection errors on...

Publication date: 2011