A VQ based speaker recognition system based in histogram distances. text independent and for noisy environments

Authors

  • Enric Monte-Moreno
  • Ramon Arqué
  • Xavier Anguera Miró
Abstract

In speaker recognition systems based on VQ, each speaker is normally assigned a codebook, and classification is done by means of a distortion distance of the utterance computed with each codebook. In [1] we proposed a system which, instead of having a codebook for each speaker, had only one codebook for all the speakers and one histogram for each speaker. This histogram was the occupancy rate of each codeword for a given speaker. This means that the histogram of a given speaker gives the probability that the speaker utters the information related to each codeword. We thus approximated the pdf of each speaker by the normalized histogram. In this paper we present an exhaustive study of different measures for comparing histograms: Kullback-Leibler, log-difference of each probability, geometrical distance, and Euclidean distance. We have also done an exhaustive study of the properties of the system for each distance in the presence of noise (white and colored), and for different parameterizations: LPC, MFCC, LPC-Cepstrum-OSA (one-sided autocorrelation sequence), and LPC-Cepstrum (cepstrum with/without liftering). As the number of experiment combinations was high, the conclusions were drawn after an analysis of variance (ANOVA) and t-tests. Thus conclusions, with significance levels, can be drawn about the differences and interactions between kind of distance, parameterization, kind of noise, and level of noise.
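
The shared-codebook idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the toy two-dimensional "features", the additive smoothing constant `eps`, the direction of the Kullback-Leibler divergence, and the reading of "log-difference" as a sum of absolute differences of log-probabilities. The geometrical distance is left out because the abstract does not define it.

```python
import numpy as np

def quantize(frames, codebook):
    """Index of the nearest codeword (squared Euclidean) for each frame."""
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def occupancy_histogram(frames, codebook, eps=1e-6):
    """Normalized codeword-occupancy histogram for one speaker or utterance.

    eps is a hypothetical smoothing term that keeps every bin non-zero
    so the log-based measures below stay finite.
    """
    idx = quantize(frames, codebook)
    counts = np.bincount(idx, minlength=len(codebook)).astype(float) + eps
    return counts / counts.sum()

def kullback_leibler(p, q):
    """KL divergence D(p || q); the direction is an assumption."""
    return float(np.sum(p * np.log(p / q)))

def log_difference(p, q):
    """Sum of absolute differences of log-probabilities (assumed form)."""
    return float(np.sum(np.abs(np.log(p) - np.log(q))))

def euclidean(p, q):
    """Euclidean distance between two histograms."""
    return float(np.sqrt(np.sum((p - q) ** 2)))

def identify(test_hist, speaker_hists, distance):
    """Name of the speaker whose histogram is closest to test_hist."""
    return min(speaker_hists, key=lambda s: distance(test_hist, speaker_hists[s]))

# Toy data: one shared codebook of four codewords in a 2-D feature space.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
train = {
    "A": np.array([[0.1, 0.0], [0.0, 0.1], [0.9, 0.0], [0.1, 0.1]]),
    "B": np.array([[0.0, 0.9], [1.0, 1.0], [0.9, 1.1], [0.1, 1.0]]),
}
speaker_hists = {s: occupancy_histogram(f, codebook) for s, f in train.items()}

test_frames = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 0.1]])
test_hist = occupancy_histogram(test_frames, codebook)

for dist in (kullback_leibler, log_difference, euclidean):
    print(dist.__name__, identify(test_hist, speaker_hists, dist))
```

In this sketch only one codebook is stored, and enrolling a speaker amounts to storing one histogram of length equal to the codebook size. Real features would be the LPC or cepstral vectors mentioned in the abstract rather than the toy points used here.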

Similar resources

Performance improvement of text-independent speaker verification systems based on histogram enhancement in noisy environments

In this paper a histogram enhancement technique is presented in order to improve the robustness of text-independent speaker verification systems. The technique transforms the features extracted from speech such that the contrast of their histogram is enhanced. Experiments showed significant improvements for this technique compared to standard techniques both in clean testing environments, and i...

Robust Speaker Identification System Based on Two-Stage Vector Quantization

This paper presents an effective method for speaker identification system. Based on the wavelet transform, the input speech signal is decomposed into several frequency bands, and then the linear predictive cepstral coefficients (LPCC) of each band are calculated. Furthermore, the cepstral mean normalization technique is applied to all computed features in order to provide similar parameter stat...

Field Evaluation of Text-Dependent Speaker Recognition in an Access Control Application

Vector quantization (VQ) is a widely used matching algorithm for text-independent speaker recognition. In this paper we study the use of text-dependent speaker recognition in a practical access control application. We compared dynamic time warping (DTW) to VQ-based matching using text-dependent pass phrases. Our goal was to find out how a fixed phrase affects speaker recognition performance. We col...

Robust Text-independent Speaker Identification in a Time-varying Noisy Environment

Practical speaker recognition systems are often subject to noise or distortions within the input speech which degrades performance. In this paper, we proposed a new mel-frequency cepstral coefficients (MFCC) based speaker identification system with Vector Quantization (VQ) modeling technique. It integrates a hearing masking effect based masker and a group of dozen triflers into traditional MFCC...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Publication date: 1998