Speaker detection using multi-speaker audio files for both enrollment and test

Authors

  • Jean-François Bonastre
  • Sylvain Meignier
  • Téva Merlin
Abstract

This paper focuses on speaker detection using multi-speaker files for both the enrollment phase and the test phase. This task was introduced during the 2002 NIST speaker recognition evaluation campaign. Enrollment data consists of three two-speaker files, and test files are also two-speaker recordings. The system presented here uses a speaker segmentation process based on an HMM conversation model, followed by a speaker matching technique, to produce one-speaker segments. Speaker detection is then performed with AMIRAL, LIA's GMM-based speaker verification system. The proposed strategy is validated using extracts from the NIST 2002 results.
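To make the final detection step concrete, the sketch below shows one common way a GMM-based verification system can score a one-speaker segment: the average per-frame log-likelihood ratio between a target-speaker GMM and a background GMM. This is an illustration under stated assumptions, not a description of AMIRAL itself. The abstract does not say how AMIRAL computes its scores, and the function names, the use of a background model, the diagonal covariances, and the threshold are all hypothetical.

```python
import numpy as np


def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (K,); means, variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                        # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)    # (K,)
    log_expo = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)  # (T, K)
    log_comp = np.log(weights)[None, :] + log_norm[None, :] + log_expo   # (T, K)
    # Log-sum-exp over mixture components, stabilised by the row maximum.
    peak = log_comp.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=1, keepdims=True))).ravel()


def llr_score(frames, target, background):
    """Average log-likelihood ratio of a segment (assumed scoring rule)."""
    return float(np.mean(gmm_frame_loglik(frames, *target)
                         - gmm_frame_loglik(frames, *background)))


def detect(frames, target, background, threshold=0.0):
    """Accept the target speaker if the score exceeds a hypothetical threshold."""
    return llr_score(frames, target, background) > threshold


# Toy single-component models in two dimensions (made-up parameters).
target = (np.array([1.0]), np.array([[3.0, 3.0]]), np.array([[1.0, 1.0]]))
background = (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]]))
segment = np.full((10, 2), 3.0)  # frames lying exactly on the target mean
print(detect(segment, target, background))
```

In a two-speaker test file, a score of this kind would be computed separately for each one-speaker segment produced by the segmentation and matching stages, and the file would be accepted if a segment's score passes the threshold.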

Similar resources

The Speakers in the Wild Speaker Recognition Challenge Plan

The Speakers in the Wild (SITW) speaker recognition challenge (SRC) is intended to support research toward the real-world application of automatic speaker recognition technology across speech acquired in unconstrained conditions. The SITW SRC will serve to benchmark current technologies in both single and multi-speaker audio with the dataset and annotations being made publicly available (under ...

A semi-automatic approach for speaker mining of tapped telephone conversations

Speaker mining involves speaker detection in a set of multispeaker files. In previous work on speaker mining, training data is used for constructing target speaker models. In this study, a new speaker mining scenario was considered, where there is no demarcation between training and testing data and prior target speaker models are absent. Given the ENRON database which consists of tapped teleph...

Acoustic hole filling for sparse enrollment data using a cohort universal corpus for speaker recognition.

In this study, the problem of sparse enrollment data for in-set versus out-of-set speaker recognition is addressed. The challenge here is that both the training speaker data (5 s) and the test material (2~6 s) are of limited duration. The limited enrollment data results in a sparse acoustic model space for the desired speaker model. The focus of this study is on filling these acoustic holes by h...

Robust Voice Mining Techniques for Telephone Conversations

Sandeep Manocha, Master of Science thesis, 2006. Thesis directed by Dr. Carol Y. Espy-Wilson, Department of Electrical Engineering. Voice mining involves speaker detection in a set of multi-speaker files. In published work, training data is used for constructing target speaker models. In this study, a new voice mining scenario w...

PSO Based Optimized Reliability for Robust Multimodal Speaker Identification

Reliable speaker recognition in real environments is a key challenge for ubiquitous services in human-computer interfaces. In this paper, we present a robust multimodal speaker identification system with optimized reliability of different modalities. We propose an extension of modified convection function’s optimizing factors to account for optimum reliability simultaneously in audio, face a...

Publication date: 2003