Universal Background Models for Real-time Speaker Change Detection

Authors

Ting-Yao Wu

Lie Lu

Ke Chen

HongJiang Zhang

Abstract

This paper addresses the problem of real-time speaker change detection in TV news broadcasts, where no prior knowledge of the speakers is assumed. To remove unreliable frames and background frames from the speech stream, we propose a new approach to feature categorization based on a Gaussian Mixture Model Universal Background Model (GMM-UBM). The feature vectors are categorized into three sets: reliable speech, doubtful speech, and unreliable speech. A corresponding novel distance measure is then presented for real-time speaker change detection. Extensive experiments demonstrate its good performance, and the intrinsic difficulties of real-time speaker change detection are also discussed.
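The abstract does not say how the three sets are derived, so the following is only a minimal sketch of one plausible reading: score each frame against a diagonal-covariance GMM-UBM and split the frames with two log-likelihood thresholds. The function names, the `(weights, means, variances)` layout of `ubm`, the thresholds `low` and `high`, and the direction of the split (higher likelihood taken as more reliable) are all assumptions made for illustration, not the authors' method.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def categorize_frames(frames, ubm, low, high):
    """Split frame indices into three sets using two assumed thresholds.

    Hypothetical rule: score >= high is 'reliable', score in [low, high)
    is 'doubtful', and score < low is 'unreliable'.
    """
    sets = {"reliable": [], "doubtful": [], "unreliable": []}
    for index, frame in enumerate(frames):
        score = gmm_log_likelihood(frame, *ubm)
        if score >= high:
            sets["reliable"].append(index)
        elif score >= low:
            sets["doubtful"].append(index)
        else:
            sets["unreliable"].append(index)
    return sets


# Toy UBM: a single 1-D Gaussian with zero mean and unit variance.
toy_ubm = ([1.0], [[0.0]], [[1.0]])
toy_frames = [[0.0], [1.5], [4.0]]
result = categorize_frames(toy_frames, toy_ubm, low=-5.0, high=-1.5)
```

In a real system the UBM would be trained on a large pool of speech, and the thresholds would have to be tuned on held-out data; the toy values above only exercise the three branches.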


Similar resources

Real-Time Unsupervised Speaker Change Detection

Information about speaker change points is very useful for speaker tracking and other applications. In this paper, we present an effective algorithm for automatic speaker change detection based on LSP correlation analysis. Moreover, a general case is considered, in which both the speakers and the number of speakers are assumed unknown. The algorithm has low complexity and can be processed in real-time...


UBM-based incremental speaker adaptation

This paper presents a novel algorithm for incremental speaker adaptation (ISA) based on a universal background model (UBM), aimed at saving storage and supporting real-time processing. This algorithm can be seen as an extension of traditional speaker adaptation. It consists of two steps, adaptation and combination. It not only considers the speaker’s characteristics in limited training data, but also prohibits ov...


Fast speaker change detection for broadcast news transcription and indexing

In this paper, we describe a new speaker change detection algorithm designed for fast transcription and audio indexing of spoken broadcast news. We have designed a two-stage algorithm that begins with a gender-independent phone-class recognition pass. We collapse the phoneme inventory to only 4 broad classes and include 4 different models for non-speech, resulting in a small fast decoder that r...


UBM Based Speaker Segmentation and Clustering for 2-Speaker Detection

In this paper, a speaker segmentation method based on the log-likelihood ratio score (LLRS) over a universal background model (UBM) and a speaker clustering method based on the difference of log-likelihood scores between two speaker models are proposed. During the segmentation process, the LLRS between two adjacent speech segments over the UBM is used as a distance measure, while during the clustering process...


Speaker dependent emotion recognition using prosodic supervectors

This work presents a novel approach for detection of emotions embedded in the speech signal. The proposed approach works at the prosodic level, and models the statistical distribution of the prosodic features with Gaussian Mixture Models (GMM) mean-adapted from a Universal Background Model (UBM). This allows the use of GMM-mean supervectors, which are classified by a Support Vector Machine (SVM...


Publication date: 2003
