Speaker Diarization using Unsupervised Compensation of Within-Speaker Variability

نویسنده

  • Hagai Aronowitz
چکیده

This paper presents a novel framework for unsupervised compensation of within-speaker variability in the context of speaker diarization. Audio session is divided into overlapping short segments, each one parameterized by a GMM-supervector. For each session independently within-speaker variability is estimated in an unsupervised manner, and is compensated using the nuisance attribute projection (NAP) method. The proposed method was evaluated in the context of speaker diarization in two-speaker telephone conversations and achieved a speaker error rate of 2.8% which is a 54% relative error reduction compared to a baseline BIC-based system.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Unsupervised Compensation of Intra-Session Intra-Speaker Variability for Speaker Diarization

This paper presents a novel framework for unsupervised compensation of intra-session intra-speaker variability in the context of speaker diarization. Audio files are parameterized by sequences of GMM-supervectors representing overlapping short segments of speech. Session-dependent intra-session intra-speaker variability is estimated in an unsupervised manner, and is compensated using the nuisan...

متن کامل

Speaker Diarization Based on Gmm Supervectors and Unsupervised Intra-speaker Variability Modeling

This paper presents a novel framework for speaker diarization. Audio is parameterized by a sequence of GMM-supervectors representing overlapping short segments of speech. Session dependent intra-session intra-speaker variability is estimated online in an unsupervised manner, and is removed from the supervectors using Nuisance Attribute Projection (NAP) The supervectors are then projected using ...

متن کامل

Extending the Task of Diarization to Speaker Attribution

In this paper we extend the concept of speaker annotation within a single-recording, or speaker diarization, to a collection wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering expectantly homogenous intersession clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoke...

متن کامل

Domain Adaptation of PLDA Models in Broadcast Diarization by Means of Unsupervised Speaker Clustering

This work presents a new strategy to perform diarization dealing with high variability data, such as multimedia information in broadcast. This variability is highly noticeable among domains (inter-domain variability among chapters, shows, genres, etc.). Therefore, each domain requires its own specific model to obtain the optimal results. We propose to adapt the PLDA models of our diarization sy...

متن کامل

Speaker Diarization Using Unsupervised Discriminant Analysis of Inter-channel Delay Feature

When multiple microphones are available estimates of inter-channel delay, which characterise a speaker’s location, can be used as features for speaker diarization. Background noise and reverberation can, however, lead to noisy features and poor performance. To ameliorate these problems, this paper presents a new approach to the discriminant analysis of delay features for speaker diarization. Th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011