Hidden Markov Model Based Speech Activity Detection for the ICSI Meeting Project

Authors

  • Thilo Pfau
  • Daniel P.W. Ellis
Abstract

As part of a project on speech recognition in meeting environments, we have collected a corpus of multi-channel meeting recordings. We expected the identification of speaker activity to be straightforward given that the participants had individual microphones, but simple approaches yielded unacceptably erroneous labelings, mainly due to crosstalk between nearby speakers and wide variations in channel characteristics. We have therefore developed a more sophisticated approach for multichannel speaker activity detection based on a simple hidden Markov model (HMM). A baseline HMM speech activity detector has been extended to use mixtures of Gaussians to achieve robustness for different speakers under different conditions. To further improve channel independence, normalized features are used. The proposed energy normalization yields a relative reduction in frame error rate of 26.4%. In a postprocessing step, the cross-correlation between different channels is used to detect crosstalk, which reduces the frame error rate by a further 12.4%.
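
The general idea of an HMM speech activity detector can be illustrated with a minimal sketch. It is not the authors' system: it assumes a two-state model (nonspeech, speech) with a single Gaussian per state instead of the Gaussian mixtures described above, and it assumes a simple normalization that subtracts each channel's minimum log energy, since the abstract does not specify the proposed normalization. The function names, means, variances, and `stay_prob` are all hypothetical values chosen for illustration.

```python
import math


def normalize_log_energy(energies):
    """Log energies minus the channel minimum (an ASSUMED normalization,
    not necessarily the one proposed in the paper)."""
    log_e = [math.log(max(e, 1e-10)) for e in energies]
    floor = min(log_e)
    return [x - floor for x in log_e]


def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_speech_activity(features, means=(0.5, 4.0),
                            variances=(1.0, 1.0), stay_prob=0.9):
    """Viterbi decoding for a two-state HMM.

    Returns one label per frame: 0 = nonspeech, 1 = speech.
    All model parameters are illustrative assumptions.
    """
    if not features:
        return []
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)

    # Initial scores: uniform prior over the two states.
    score = [math.log(0.5) + gaussian_logpdf(features[0], means[s], variances[s])
             for s in range(2)]
    backptr = []
    for x in features[1:]:
        new_score, ptr = [], []
        for s in range(2):
            # Best predecessor for state s at this frame.
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in range(2)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new_score.append(cands[best]
                             + gaussian_logpdf(x, means[s], variances[s]))
        score = new_score
        backptr.append(ptr)

    # Trace back the most likely state sequence.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    frame_energies = [1.0] * 5 + [100.0] * 5 + [1.0] * 5
    print(viterbi_speech_activity(normalize_log_energy(frame_energies)))
```

The transition probabilities act as a smoother: a single moderately loud frame surrounded by silence stays labeled as nonspeech, because switching states twice costs more than the frame gains from the speech model. Crosstalk rejection via cross-correlation between channels, as described in the abstract, would be a separate step applied to the decoded labels and is not shown here.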

Similar resources

Multispeaker Speech Activity Detection for the Icsi Meeting Recorder

As part of a project into speech recognition in meeting environments, we have collected a corpus of multi-channel meeting recordings. We expected the identification of speaker activity to be straightforward given that the participants had individual microphones, but simple approaches yielded unacceptably erroneous labelings, mainly due to crosstalk between nearby speakers and wide variations in...

The AMI Speaker Diarization System for NIST RT06s Meeting Data

We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Error Tradeoff analysis commonly used in speaker detection tasks. The speaker diarization systems are based on the TNO and ICSI system submitted for RT05s...

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...

Intrusion Detection Using Evolutionary Hidden Markov Model

Intrusion detection systems are responsible for diagnosing and detecting any unauthorized use, exploitation, or destruction of a system, and can prevent cyber-attacks by analyzing network packets. One of the major challenges in using these tools is the lack of training patterns of attacks for the analysis engine; engine failure that caused the complete training, ...

Speech/laughter classification in meeting audio

In this paper, harmonicity information is incorporated into acoustic features to detect laughter segments and speech segments. We implement our system using a hidden Markov model (HMM) classifier trained on Pitch and Harmonic Frequency Scale based subband filters (PHFS). The harmonicity of the signal can be determined from the variation of the pitch and harmonics. The cascaded subband filters are used to ...

Publication date: 2001