Phonetic, idiolectal and acoustic speaker recognition

نویسندگان

  • Walter D. Andrews
  • Mary A. Kohler
  • Joseph P. Campbell
  • John J. Godfrey
چکیده

This paper describes a text-independent speaker recognition system that achieves an equal error rate of less than 1% by combining phonetic, idiolect, and acoustic features. The phonetic system is a novel language-independent speakerrecognition system based on differences among speakers in dynamic realization of phonetic features (i.e., pronunciation), rather than spectral differences in voice quality. The system exploits phonetic information from six languages to perform text-independent speaker recognition. The idiolectal system models speaker idiosyncrasies with word n-gram frequency counts computed from the output of an automatic speech recognition system. The acoustic system is a Gaussian Mixture Model-Universal Background Model that exploits the spectral differences in voice quality. All experiments were performed on the NIST 2001 Speaker Recognition Evaluation Extended Data Task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Support vector machine fusion of idiolectal and acoustic speaker information in Spanish conversational speech

This paper proposes a Support Vector Machine (SVM) based combining scheme that incorporates ideolectal and acoustic characteristics for speaker recognition. Two statistical model paradigms, namely GMM for acoustic modeling and Bigrams for language modeling, provide multilevel speaker information that affords a better classification performance when SVM-based fusion is accomplished. This combini...

متن کامل

Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine

In this paper, we argue the way of modeling speech signals based on three-way restricted Boltzmann machine (3WRBM) for separating phonetic-related information and speaker-related information from an observed signal automatically. The proposed model is an energy-based probabilistic model that includes three-way potentials of three variables: acoustic features, latent phonetic features, and speak...

متن کامل

On the Relationship between Phone Phonetic Speaker Recog

Speaker recognition techniques have traditionally relied on purely acoustic features and models. During the last few years, however, the field of speaker recognition has started to show interest in the use of higher level features. In particular, phonetic decodings modeled with statistical language models (n-grams) have already shown its effectiveness in several research works. However, the rel...

متن کامل

Speaker recognition based on idiolectal differences between speakers

“Familiar” speaker information is explored using non-acoustic features in NIST’s new “extended data” speaker detection task.[1] Word unigrams and bigrams, used in a traditional target/background likelihood ratio framework, are shown to give surprisingly good performance. Performance continues to improve with additional training and/or test data. Bigram performance is also found to be a function...

متن کامل

Text-based Speaker Identification on Multiparty Dialogues Using Multi-document Convolutional Neural Networks

We propose a convolutional neural network model for text-based speaker identification on multiparty dialogues extracted from the TV show, Friends. While most previous works on this task rely heavily on acoustic features, our approach attempts to identify speakers in dialogues using their speech patterns as captured by transcriptions to the TV show. It has been shown that different individual sp...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001