MFCC analysis

Integrating the energy information into MFCC

2000

Fang Zheng, Guoliang Zhang

The Mel-Frequency Cepstrum Coefficients (MFCC), introduced in 1980 by Davis and Mermelstein [2], are a widely used feature set in automatic speech recognition systems. In the traditional implementation, the 0th coefficient is excluded because it is considered somewhat unreliable. In this paper, we analyze this term and find that it can be regarded as the generalized frequency band energy (FBE) ...
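
The link between the 0th coefficient and band energy can be illustrated with a small sketch. It assumes the common formulation in which MFCCs are an unnormalized DCT-II of log mel filter-bank energies; the abstract does not spell out its exact formulation, and the energy values and the helper name dct_ii are invented for illustration. Under that assumption, the 0th coefficient is simply the sum of the log band energies.

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II: c[k] = sum_n x[n] * cos(pi * k * (n + 0.5) / N)."""
    n_bands = len(x)
    return [
        sum(x[n] * math.cos(math.pi * k * (n + 0.5) / n_bands)
            for n in range(n_bands))
        for k in range(n_bands)
    ]

# Hypothetical mel filter-bank energies for one frame (illustrative values only).
band_energies = [0.8, 1.5, 2.3, 1.1, 0.6, 0.4]
log_energies = [math.log(e) for e in band_energies]

cepstrum = dct_ii(log_energies)

# For k = 0 every cosine term equals 1, so c0 is the sum of the log energies.
c0 = cepstrum[0]
sum_log_energy = sum(log_energies)
print(c0, sum_log_energy)
```

With an orthonormal DCT the same relationship holds up to a constant scale factor, so the choice of normalization here is a convenience rather than a claim about the paper.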

Enhanced Performance of Search Engine with Multitype Feature Co-selection of Fuzzy K-Means Clustering Algorithm

2013

K. Parimala, V. Palanisamy

The information world faces many challenges today, one of which is data retrieval from multidimensional and heterogeneous data sets. Han et al. carried out a trial addressing this challenge. They proposed a novel feature co-selection method for Web document clustering, called Multitype Features Co-selection for Clustering (MFCC). MFCC uses intermediate clustering results in one type of...

Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model

2002

Ben P. Milner, Xu Shao

This work presents a method of reconstructing a speech signal from a stream of MFCC vectors using a source-filter model of speech production. The MFCC vectors are used to provide an estimate of the vocal tract filter. This is achieved by inverting the MFCC vector back to a smoothed estimate of the magnitude spectrum. The Wiener-Khintchine theorem and linear predictive analysis transform this in...
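
The inversion step described above, from an MFCC vector back to a smoothed spectral estimate, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the MFCC vector is a truncated unnormalized DCT-II of log mel filter-bank energies, it stops at the smoothed filter-bank energies, and it does not cover the later source-filter stages. All values and helper names are hypothetical.

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II of a sequence."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]

def idct_ii(c):
    """Inverse of the unnormalized DCT-II defined above."""
    n = len(c)
    return [(c[0] + 2.0 * sum(c[k] * math.cos(math.pi * k * (i + 0.5) / n)
                              for k in range(1, n))) / n
            for i in range(n)]

# Hypothetical log mel filter-bank energies for one frame.
log_energies = [math.log(e) for e in [0.8, 1.5, 2.3, 1.1, 0.6, 0.4, 0.3, 0.2]]
n_bands = len(log_energies)

# Keep only the first few coefficients, as an MFCC front end typically does.
n_kept = 4
mfcc = dct_ii(log_energies)[:n_kept]

# Zero-pad the discarded coefficients and invert: the result is a smoothed
# version of the log energies, since the fine detail was thrown away.
padded = mfcc + [0.0] * (n_bands - n_kept)
smoothed_log = idct_ii(padded)
smoothed_energies = [math.exp(v) for v in smoothed_log]

# Sanity check of the transform pair using all coefficients.
round_trip = idct_ii(dct_ii(log_energies))
```

Because the 0th coefficient is kept, the mean of the smoothed log energies matches the mean of the originals; only the detail carried by the discarded higher-order coefficients is lost.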

Auditory spectrum based features (ASBF) for robust speech recognition

2000

Chi H. Yim, Oscar C. Au, Wanggen Wan, Cyan L. Keung, Carrson C. Fung

MFCC are features commonly used in speech recognition systems today. The recognition accuracy of systems using MFCC is known to be high in clean speech environments, but it drops greatly in noisy environments. In this paper, we propose new features called the auditory spectrum based features (ASBF) that are based on the cochlear model of the human auditory system. These new features can track the...

LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization

2011

Sree Hari Krishnan Parthasarathi, Hervé Bourlard, Daniel Gatica-Perez

We present a comprehensive study of linear prediction residual for speaker diarization on single and multiple distant microphone conditions in privacy-sensitive settings, a requirement to analyze a wide range of spontaneous conversations. Two representations of the residual are compared, namely real-cepstrum and MFCC, with the latter performing better. Experiments on RT06eval show that residual...

An Enhanced Speech Recognition System

2014

Suma Shankaranand, Mani Sharma, K. V. Ramakrishnan

This paper describes the development of an efficient speech recognition system using various techniques such as Mel Frequency Cepstrum Coefficients (MFCC), Vector Quantization (VQ), Hidden Markov Model (HMM) and Autocorrelation. To recognize speech faster and more accurately, speaker recognition is followed by speech recognition. MFCC/Autocorrelation is used to extrac...

Effectiveness of KL-transformation in spectral delta expansion

1999

M. Tokuhira, Yasuo Ariki

MFCC is widely used, together with its delta and delta-delta features, in HMM-based speech recognition. MFCC is computed by applying the DCT to the MF output. In this paper we propose employing the KL transformation instead of the DCT, because it can reflect the statistics of speech data more precisely. MFCC is a compressed representation of the log MF output, so some detailed features seem to be lost. ...
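
For readers unfamiliar with the delta and delta-delta features mentioned above, the sketch below uses the common regression formula d[t] = sum_n n (c[t+n] - c[t-n]) / (2 sum_n n^2). The abstract does not state which delta formula or window the authors use, so the formula, the window of 2, the edge clamping, and the toy frames are all assumptions made for illustration. The KL transformation itself is not shown.

```python
def delta(frames, window=2):
    """Regression-based delta features; frame indices are clamped at the edges."""
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out

# Hypothetical 2-dimensional static features for five frames:
# the first dimension rises by 1 per frame, the second is constant.
frames = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]

deltas = delta(frames)          # first-order dynamics
delta_deltas = delta(deltas)    # second-order dynamics (delta of delta)

# Static, delta, and delta-delta features concatenated per frame.
expanded = [s + d + dd for s, d, dd in zip(frames, deltas, delta_deltas)]
```

In the middle frame the first dimension has a delta of 1.0 (its slope per frame) and the constant second dimension has a delta of 0.0; values near the edges are affected by the clamping.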

Feature extraction using Mel frequency cepstral coefficients for hyperspectral image classification

2010

Delian Liu, Xiaorui Wang, Jianqi Zhang, Xi Huang

The Mel frequency cepstral coefficient (MFCC) model, which is widely used in speech detection and recognition, is introduced to extract features from hyperspectral image data. The similarities and differences between speech signals and spectral image data are compared and analyzed. The standard MFCC model is then improved to suit the characteristics of spectral image data by reintroducing the d...

Unsupervised speaker segmentation with residual phase and MFCC features

Journal: Expert Syst. Appl. 2009

S. Jothilakshmi, Vennila Ramalingam, S. Palanivel

This paper proposes an unsupervised method for improving automatic speaker segmentation performance by combining the evidence from residual phase (RP) and mel frequency cepstral coefficients (MFCC). This method demonstrates the complementary nature of the speaker-specific information present in the residual phase compared with the information present in conventional MFCC. Moreover this...

MAP prediction of pitch from MFCC vectors for speech reconstruction

2004

Xu Shao, Ben P. Milner

This work proposes a method of predicting pitch and voicing from mel-frequency cepstral coefficient (MFCC) vectors. Two maximum a posteriori (MAP) methods are considered. The first models the joint distribution of the MFCC vector and pitch using a Gaussian mixture model (GMM) while the second method also models the temporal correlation of the pitch contour using a combined hidden Markov model (...