Mel frequency cepstral coefficient (MFCC)

Optimization of Features Parameters for HMM Phoneme Recognition of TIMIT Corpus

2013

Ines BEN FREDJ Kaïs OUNI

The phoneme is the smallest contrastive unit in the sound system of a language, and it plays a meaningful role in speech recognition. In this study, we are interested in phoneme recognition on the TIMIT database using the HTK toolkit for HMMs. The main goal is to determine the optimal parameters for the recognizer. To this end, different speech analysis techniques were applied, such as Mel Freque...


Comparative study of GMM, DTW, and ANN on Thai speaker identification system

2000

Chularat Tanprasert Varin Achariyakulporn

This paper proposes a new investigation on Gaussian mixture model (GMM) by comparing it with some preliminary experiments on multilayered perceptron network (MLP) with backpropagation learning algorithm (BKP) and dynamic time warping (DTW) techniques on Thai text-dependent speaker identification system. Three major identification engines are conducted on 50 speakers with isolated digits 0-9. Tr...


Spectral Analysis of Speech: A New Technique

2006

ICA, which is generally used for the blind source separation problem, has been tested for feature extraction in speech recognition systems to replace the phoneme-based approach of MFCC. Applying the generated cepstral coefficients to ICA as preprocessing yields a new signal processing approach. This gives much better results than MFCC and ICA separately, both for word and speaker recognition...


Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques

Journal: CoRR 2010

Lindasalwa Muda Mumtaj Begam I. Elamvazuthi

Digital processing of the speech signal and voice recognition algorithms are very important for fast and accurate automatic voice recognition technology. The voice is a signal of infinite information. Direct analysis and synthesis of the complex voice signal is difficult because of the large amount of information contained in the signal. Therefore the digital signal processes such as Feature Extraction and Feature Matching...


Wavelet Packet Transform Features with Application to Speaker Identification

1998

Ruhi Sarikaya Bryan L. Pellom John H. L. Hansen

This study proposes a new set of feature parameters based on wavelet packet transform analysis of the speech signal. The new speech features are named subband based cepstral parameters (SBC) and wavelet packet parameters (WPP). The ability of each parameter set to capture speaker identity conveyed in the speech signal is compared to the widely used Mel-frequency cepstral coefficients (MFCC). The ...


Examining the Influence of Speech Frame Size and Number of Cepstral Coefficients on the Speech Recognition Performance

2007

Iosif Mporas Nikos Fakotakis

In the present work we explore the influence of front-end setup on the speech recognition performance. Specifically, we study the dependence between specific parameters of the speech parameterization stage, such as speech frame size and number of Mel-frequency cepstral coefficients (MFCC), and the word error rate (WER). Our comparative evaluation is performed by employing the Sphinx-IV speech r...


Articulation Rate Filtering of CQCC Features for Automatic Speaker Verification

2016

Massimiliano Todisco Héctor Delgado Nicholas W. D. Evans

This paper introduces a new articulation rate filter and reports its combination with recently proposed constant Q cepstral coefficients (CQCCs) in their first application to automatic speaker verification (ASV). CQCC features are extracted with the constant Q transform (CQT), a perceptually-inspired alternative to Fourier-based approaches to time-frequency analysis. The CQT offers greater freq...


Similarity Measurement for Speaker Identification Using Frequency of Vector Pairs

2014

Inggih Permana Agus Buono Bib Paruhum Silalahi

Similarity measurement is an important part of speaker identification. This study has modified the similarity measurement technique performed in previous studies. Previous studies used the sum of the smallest distance between the input vectors and the codebook vectors of a particular speaker. In this study, the technique has been modified by selecting a particular speaker codebook which has the...
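
The baseline attributed above to previous studies (the sum of the smallest distances between the input vectors and a speaker's codebook vectors) can be sketched as follows. This is a minimal illustration only: the Euclidean metric, the function names `min_distance_sum` and `identify_speaker`, and the toy two-dimensional data are all assumptions, and the modified technique proposed in the paper is not shown because the abstract is truncated.

```python
import math

def min_distance_sum(input_vectors, codebook):
    """Sum, over all input vectors, of the distance to the nearest codeword.

    Euclidean distance is an assumption; the abstract does not name the metric.
    """
    total = 0.0
    for x in input_vectors:
        total += min(math.dist(x, c) for c in codebook)
    return total

def identify_speaker(input_vectors, codebooks):
    """Return the speaker whose codebook yields the smallest distance sum."""
    return min(codebooks, key=lambda s: min_distance_sum(input_vectors, codebooks[s]))

# Hypothetical toy data: two speakers, two codewords each, 2-D feature vectors.
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
}
features = [(0.0, 1.0), (1.0, 0.0)]

best = identify_speaker(features, codebooks)
```

In practice the input vectors would be per-frame features such as MFCCs, and each codebook would be trained per speaker, for example by vector quantization.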


A New Hierarchical Structure for Speech Recognition by units smaller than words, using Wavelet Packet and SVM

2012

Adriano de Andrade Bresolin Adrião Duarte Dória Neto Pablo Javier Alsina

This study proposes using units smaller than words, such as phonemes and syllables, as base units for speech recognition. The system presented here was developed with a hierarchical recognition logic based on the production characteristics of phonemes in Brazilian Portuguese. Decisions are made by Support Vector Machine neural networks grouped to form Specialist Machines. The descriptors used w...


Feature Learning with Gaussian Restricted Boltzmann Machine for Robust Speech Recognition

Journal: CoRR 2013

Xin Zheng Zhiyong Wu Helen M. Meng Weifeng Li Lianhong Cai

In this paper, we first present a new variant of Gaussian restricted Boltzmann machine (GRBM) called multivariate Gaussian restricted Boltzmann machine (MGRBM), with its definition and learning algorithm. Then we propose using a learned GRBM or MGRBM to extract better features for robust speech recognition. Our experiments on Aurora2 show that both GRBM-extracted and MGRBM-extracted feature per...
