mfcc

Multi-Devices Hindi Speech Database for Speaker Identification using GMM

2013

Sonu Kumar Mahesh Chandra

Abstract: In this paper, we study the effect on a speaker identification (SI) system when speech data is recorded on two different sensors, an HP Pavilion third-generation laptop and a Samsung mobile (S3770K), both with built-in microphones, in parallel in a closed room in a noise-free environment. The database contains 10 Hindi sentences (50-60 seconds speech) and one English sentence (7-8 seconds sp...

Improvement of Text Dependent Speaker Identification System Using Neuro-Genetic Hybrid Algorithm in Office Environmental Conditions

Journal: CoRR 2009

Md. Rabiul Islam Md. Fayzur Rahman

In this paper, an improved strategy for an automated text-dependent speaker identification system in noisy environments is proposed. The identification process incorporates the Neuro-Genetic hybrid algorithm with cepstral-based features. To remove the background noise from the source utterance, a Wiener filter has been used. Different speech pre-processing techniques such as start-end point dete...

Robustness to additive noise of locally-normalized cepstral coefficients in speaker verification

2015

Josué Fredes José Novoa Víctor Poblete Simon King Richard M. Stern Néstor Becerra Yoma

In this paper, the performance of a new feature set, Locally Normalized Cepstral Coefficients (LNCC), is evaluated for a speaker verification task with short testing utterances in additive noise. The results presented here show that LNCC outperforms baseline MFCC features when SNR is lower than 15 dB. The average relative reduction in EER achieved by LNCC is 33%. The use of LNCC in combination wi...
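The equal error rate (EER) and the "relative reduction" quoted in this abstract can be illustrated with a minimal sketch. It assumes similarity scores where higher means "more likely the target speaker"; the scores and the EER values fed to `relative_reduction` are made-up placeholders, not results from the paper, and both function names are hypothetical.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) where false accepts and false rejects are closest.

    Assumes higher scores mean "more likely the target speaker".
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)      # targets rejected
        far = sum(s >= t for s in impostor) / len(impostor)   # impostors accepted
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def relative_reduction(baseline_eer, new_eer):
    """Fraction by which new_eer improves on baseline_eer."""
    return (baseline_eer - new_eer) / baseline_eer


# Placeholder scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(eer, threshold)                 # 0.25 0.6
print(relative_reduction(6.0, 4.0))   # one third, i.e. about 33%
```

A relative reduction of 33% therefore means the new EER is about two thirds of the baseline EER, not that the EER dropped by 33 percentage points.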

Distributed Speech Recognition Using TRAPS-Estimated Manner...

2002

Pratibha Jain Brian Kingsbury

In this paper, we investigate the use of TemPoRal PatternS (TRAPS) classifiers for estimating manner of articulation features on the small-vocabulary Aurora-2002 database. By combining a stream of TRAPS-estimated manner features with a stream of noise-robust MFCC features (earlier proposed in the Aurora-2002 evaluation by OGI, ICSI and Qualcomm), we obtain an average absolute improvement of 0.4...

Unsupervised Representation Learning Using Convolutional Restricted Boltzmann Machine for Spoof Speech Detection

2017

Hardik B. Sailor Madhu R. Kamble Hemant A. Patil

Speech Synthesis (SS) and Voice Conversion (VC) present a genuine risk of attacks on Automatic Speaker Verification (ASV) technology. In this paper, we use our recently proposed unsupervised filterbank learning technique using a Convolutional Restricted Boltzmann Machine (ConvRBM) as a front-end feature representation. ConvRBM is trained on the training subset of the ASVspoof 2015 challenge database. A...

Feature extraction for unit selection in concatenative speech synthesis: comparison between AIM, LPC, and MFCC

2002

Minoru Tsuzaki Hisashi Kawai

A comprehensive computational model of the human auditory peripherals (AIM) was applied to extract basic features of speech sounds aiming at optimal unit selection in concatenative speech synthesis. The performance of AIM was compared to that of a purely physical model (LPC) as well as that of an approximate auditory model (MFCC) by basic perceptual experiments. While a significant advantage of...

Speech Emotion Verification System (SEVS) Based on MFCC for Real-Time Applications

2008

Norhaslinda Kamaruddin Abdul Wahab

Humans recognize speech emotions by extracting features from the speech signals received through the cochlea, which are then passed on for processing. In this paper, we propose the use of Mel-Frequency Cepstral Coefficients (MFCC) to extract the speech emotion information, providing both frequency- and time-domain information for analysis. Since features extracted using the MFCC simulat...

Language Recognition on Albayzin 2010 LRE using PLLR features / Reconocimiento de la Lengua en Albayzin 2010 LRE utilizando características PLLR

2013

M. Diez A. Varona M. Penagarikano L. J. Rodriguez-Fuentes G. Bordel

Phone Log-Likelihood Ratios (PLLR) have recently been proposed as alternative features to MFCC-SDC for iVector Spoken Language Recognition (SLR). In this paper, PLLR features are first described, and then further evidence of their usefulness for SLR tasks is provided with a new set of experiments on the Albayzin 2010 LRE dataset, which features wide-band multi-speaker TV broadcast speech on si...

Text Independent Speaker Identification with Principal Component Analysis

2013

D. Vijendra Kumar

Principal Component Analysis (PCA) is useful in identifying patterns in data and expressing the data in a manner that highlights their similarities and differences. This concept was applied to reduce high-dimensional Mel Frequency Cepstral Coefficients (MFCC) into low-dimensional feature vectors. Since MFCCs are high in dimension and truncation of these dependent coefficients may lead to er...
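As a rough illustration of the idea in this abstract, the sketch below projects a matrix of feature vectors onto its leading principal components using NumPy's SVD. It is not the author's implementation: the input is random placeholder data standing in for MFCC frames, the 39-dimensional frame size and the choice of 12 components are assumptions, and `pca_reduce` is a hypothetical helper name.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project rows of `features` (frames x coefficients) onto the top principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data: rows of vt are principal directions,
    # ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return centered @ components.T, components, explained[:n_components]


# Placeholder data standing in for 200 frames of 39-dimensional features
# (an assumed size, not taken from the paper).
rng = np.random.default_rng(0)
fake_mfcc = rng.normal(size=(200, 39))

reduced, components, ratio = pca_reduce(fake_mfcc, 12)
print(reduced.shape)   # (200, 12)
print(ratio.sum())     # fraction of total variance kept by the 12 components
```

Unlike simply truncating coefficients, the projection keeps the directions of largest variance, so correlated coefficients are combined rather than discarded.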

Optimization of Features Parameters for HMM Phoneme Recognition of TIMIT Corpus

2013

Ines BEN FREDJ Kaïs OUNI

The phoneme is the smallest contrastive unit in the sound system of a language, and it plays a meaningful role in speech recognition. In this study, we are interested in phoneme recognition on the TIMIT database using the HTK toolkit for HMMs. The main goal is to determine the optimal parameters for the recognizer. To this end, different speech analysis techniques were applied, such as Mel Freque...
