Search results for: MFCC coefficients

Number of results: 15840

2008
Svein Gunnar Pettersen Magne Hallstein Johnsen

In this paper we investigate the use of voice activity detection (VAD) for improving noise models used for cepstral domain minimum mean squared error (MMSE) filtering of noisy speech. Due to the popularity of MFCC features for speech recognition, it is useful to have VAD methods and MMSE filtering algorithms that both work in the MFCC domain. We propose a method for VAD based on the likelihood ...

2005
Hakan Erdoğan

Linear Discriminant Analysis (LDA) followed by a diagonalizing maximum likelihood linear transform (MLLT) applied to spliced static MFCC features yields important performance gains as compared to MFCC+dynamic features in most speech recognition tasks. It is reasonable to regularize LDA transform computation for stability. In this paper, we regularize LDA and heteroscedastic LDA transforms usin...

2009
I. Yücel Özbek Mark Hasegawa-Johnson Mübeccel Demirekler

This work examines the utility of formant frequencies and their energies in acoustic-to-articulatory inversion. For this purpose, formant frequencies and formant spectral amplitudes are automatically estimated from audio, and are treated as observations for the purpose of estimating electromagnetic articulography (EMA) coil positions. A mixture Gaussian regression model with mel-frequency cepst...

1997
Bojan Petek Ove Andersen Paul Dalsgaard

Two systems (Statistical Trajectory Models (STM) and continuous density HMMs) utilizing three preprocessing methodologies (MFCC, RASTA and FBDYN) were evaluated on two databases, namely CTIMIT and the corresponding downsampled TIMIT. Within the bounds of the experimental setup the comparative performance analysis showed that the STM significantly outperforms the HMM system on the CTIMIT databas...

2001
Dušan Macho Climent Nadeu Javier Hernando Jaume Padrell

MFCCs perform well when used for clean speech recognition. However, for noisy speech the recognition rates go down. Augmenting the MFCC feature vector by dynamic features improves both discrimination and robustness of the MFCC-based recognizer. In this paper, we present an alternative parameterization based on the frequency filtering (FF) technique. By using FF, a significant improvement with ...

2004
Tommi Lahti

In this paper, a novel method for beginning of utterance detection is proposed for low complexity ASR systems. Assuming MFCC calculations in the ASR front-end, the additional computational load due to the algorithm is negligible. The algorithm makes use of the delay between the MFCC calculation and decoding process, which is typical in front-ends with feature normalization. The main steps of th...

2013
Sonu Kumar Mahesh Chandra

In this paper, we study the effect on a speaker identification (SI) system when speech data is recorded in parallel on two different sensors, an HP Pavilion third-generation laptop and a Samsung mobile (S3770K), both with built-in microphones, in a closed room in a noise-free environment. The database contains 10 Hindi sentences (50-60 seconds speech) and one English sentence (7-8 seconds sp...

Journal: CoRR 2009
Md. Rabiul Islam Md. Fayzur Rahman

In this paper, an improved strategy for an automated text-dependent speaker identification system in noisy environments is proposed. The identification process incorporates the NeuroGenetic hybrid algorithm with cepstral-based features. To remove the background noise from the source utterance, a Wiener filter is used. Different speech pre-processing techniques such as start-end point dete...

2015
Josué Fredes José Novoa Víctor Poblete Simon King Richard M. Stern Néstor Becerra Yoma

In this paper the performance of a new feature set, Locally Normalized Cepstral Coefficients (LNCC) is evaluated for a speaker verification task with short testing utterances in additive noise. The results presented here show that LNCC outperforms baseline MFCC features when SNR is lower than 15 dB. The average relative reduction in EER achieved by LNCC is 33%. The use of LNCC in combination wi...

2002
Pratibha Jain Brian Kingsbury

In this paper, we investigate the use of TemPoRal PatternS (TRAPS) classifiers for estimating manner of articulation features on the small-vocabulary Aurora-2002 database. By combining a stream of TRAPS-estimated manner features with a stream of noise-robust MFCC features (earlier proposed in the Aurora-2002 evaluation by OGI, ICSI and Qualcomm), we obtain an average absolute improvement of 0.4...
