Search results for: mel frequency cepstral coefficient mfcc

Number of results: 644930

2002
James H. Nealand, Alan B. Bradley

Speaker recognition is the task of identifying an individual from their voice. Typically this task is performed in two consecutive stages: feature extraction and classification. Using a Gaussian Mixture Model (GMM) classifier, different filter-bank configurations were compared as feature extraction techniques for speaker recognition. The filter-banks were also compared to the popular Mel-Frequen...

2004
Jonathan Darch, Ben Milner, Xu Shao

This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCC vectors and formant vectors using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method pred...

2013
Tomyslav Sledevič, Artūras Serackis, Gintautas Tamulevičius, Dalius Navakauskas

This paper presents a comparative evaluation of feature extraction algorithms for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive and their cepstral coefficients were implemented in a hardware/software design. The proposed system was investigated in speaker-dependent mode for 100 different Lithuanian words. The robus...

2003
Britta Wrede

The present investigation analyses the behaviour of the first-order derivatives of the log-mel-spectrum of vowels, which constitutes the basis for the mel-frequency cepstral coefficients (MFCC). The results indicate that the dynamic features, when inspected at the log-mel-spectrum level, seem to be less influenced by speaker-specific characteristics and degrade less in fast speech. However, when analys...

2017
Muhammad Asim Ali, Zain Ahmed Siddiqui

Music genre classification has been an inspiring task in the area of music information retrieval (MIR). Genre classification can help with practical problems such as creating song references, finding related songs, and finding groups of listeners who will like a specific song. The purpose of our research is to find the best machine learning algorithm that predicts the genre of s...

2013
R. Visalakshi

In this paper, we analyze the performance of a speaker recognition system based on features extracted from speech recorded with a throat microphone in clean and noisy environments. In general, speaker recognition systems perform better on clean speech. In a noisy environment, a transducer held at the throat yields a signal that is clean even in the presence of noise. This speak...

Journal: I. J. Speech Technology 2013
Biswajit Das, Sandipan Mandal, Pabitra Mitra, Anupam Basu

The article studies age-related variations in the speech characteristics of two age groups in the Bengali language. The study considers 60 speakers in each age group, 60–80 years and 20–40 years. We have considered different voice source features such as fundamental frequency, formant frequencies, jitter, shimmer and harmonic-to-noise ratio. Cepstral domain feature, Mel Frequency ...

2014
Md. Jahangir Alam, Patrick Kenny, Pierre Dumouchel, Douglas D. O'Shaughnessy

This work presents a noise spectrum estimator based on the Gaussian mixture model (GMM)-based speech presence probability (SPP) for robust speech recognition. The estimated noise spectrum is then used to compute a subband a posteriori signal-to-noise ratio (SNR). A sigmoid-shaped weighting rule is formed based on this subband a posteriori SNR to enhance the speech spectrum in the auditory domain, wh...

2013
Md. Jahangir Alam, Yazid Attabi, Pierre Dumouchel, Patrick Kenny, Douglas D. O'Shaughnessy

The goal of speech emotion recognition (SER) is to identify the emotional or physical state of a human being from his or her voice. One of the most important things in a SER task is to extract and select relevant speech features with which most emotions could be recognized. In this paper, we present a smoothed nonlinear energy operator (SNEO)-based amplitude modulation cepstral coefficients (AM...

2011
Huan Zhao, He Liu, Kai Zhao, Yong Yang

The performance of the traditional mel-frequency cepstral coefficient (MFCC) speech feature extraction method decreases drastically in complex noisy environments. To improve the performance and robustness of a speech recognition system based on a spectral envelope estimation method, the minimum variance distortionless response spectrum MVDR-MFCC (Minimum Variance Distortionless Response-MFCC) feat...
