MFCC analysis

Isolated Word Recognition Using MFCC and Vector Quantization

2016

Manish Kumar Sharma

Automatic Speech Recognition (ASR) technology is a way to interface with a computer. In this paper we describe a speech recognition technique using multiple codebooks of MFCC-derived features. The proposed algorithm is useful for detecting isolated words of speech. In this algorithm we first create a database, i.e. a codebook, by calculating mel frequency cepstral coefficients first and then codeword for e...
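
As a rough illustration of the codebook idea in this abstract, here is a minimal Python sketch; it is not the paper's implementation. It assumes feature vectors (for example MFCC frames) are already available as lists of floats, trains one codebook per word with a plain k-means (Lloyd) loop, and picks the word whose codebook gives the lowest average distortion. The function names, the toy 2-D vectors, and the evenly spaced initialization are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, size, iterations=20):
    """Build a codebook of `size` codewords with Lloyd (k-means) iterations.

    Assumption: codewords are initialized from evenly spaced training
    vectors, which requires len(vectors) >= size.
    """
    step = max(1, len(vectors) // size)
    codebook = [list(vectors[i * step]) for i in range(size)]
    for _ in range(iterations):
        # Assign every training vector to its nearest codeword.
        clusters = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(len(codebook)),
                          key=lambda k: squared_distance(v, codebook[k]))
            clusters[nearest].append(v)
        # Move each codeword to the mean of its cluster (keep it if empty).
        for k, members in enumerate(clusters):
            if members:
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def average_distortion(vectors, codebook):
    """Mean distance from each vector to its closest codeword."""
    return sum(min(squared_distance(v, c) for c in codebook)
               for v in vectors) / len(vectors)


def recognize(vectors, codebooks):
    """Return the word whose codebook quantizes `vectors` with least distortion."""
    return min(codebooks, key=lambda word: average_distortion(vectors, codebooks[word]))


# Toy usage with made-up 2-D "features" standing in for MFCC frames.
codebooks = {
    "yes": train_codebook([[0.0, 0.1], [0.1, 0.0], [1.0, 1.1], [1.1, 1.0]], size=2),
    "no": train_codebook([[5.0, 5.1], [5.1, 5.0], [6.0, 6.1], [6.1, 6.0]], size=2),
}
guess = recognize([[0.2, 0.0], [0.9, 1.0]], codebooks)
```

In a real system the codebook size and the distortion measure would be tuning choices; the sketch fixes them only to keep the example short.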

Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features

2015

Ömer Eskidere, Ahmet Gürhanli

Mel Frequency Cepstral Coefficients (MFCCs) are widely used to extract essential information from a voice signal and have become a popular feature extractor in audio processing. However, MFCC features are usually calculated from a single window (taper), which is characterized by large variance. This study investigates reducing variance for the classification of two different voice...
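
To make the single-window versus multitaper distinction concrete, the sketch below averages the power spectra obtained with several orthonormal tapers. It is a hedged illustration only: the abstract shown here does not say which taper family the study uses, so sine tapers are an assumption, as are the function names, the uniform weighting, and the direct DFT (chosen to avoid external dependencies). In an MFCC pipeline, such an averaged spectrum would stand in for the single-window periodogram before the mel filterbank, logarithm, and DCT steps.

```python
import cmath
import math


def sine_tapers(length, count):
    """Return `count` orthonormal sine tapers of the given length."""
    scale = math.sqrt(2.0 / (length + 1))
    return [[scale * math.sin(math.pi * k * (n + 1) / (length + 1))
             for n in range(length)]
            for k in range(1, count + 1)]


def power_spectrum(frame):
    """Squared DFT magnitude, computed directly in O(N^2) for short frames."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * b * t / n)
                    for t, x in enumerate(frame))) ** 2
            for b in range(n)]


def multitaper_spectrum(frame, taper_count=3):
    """Average the power spectra of the frame under each taper (uniform weights)."""
    tapers = sine_tapers(len(frame), taper_count)
    spectra = [power_spectrum([w * x for w, x in zip(taper, frame)])
               for taper in tapers]
    return [sum(col) / taper_count for col in zip(*spectra)]


# Toy usage: a 32-sample cosine at DFT bin 4.
frame = [math.cos(2 * math.pi * 4 * t / 32) for t in range(32)]
spectrum = multitaper_spectrum(frame, taper_count=3)
```

Averaging over more tapers lowers the variance of the estimate at the cost of a wider spectral peak, so the taper count is a trade-off rather than a fixed value.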

Isolated Telugu Speech Recognition using MFCC and Gamma tone features by Radial Basis Networks in Noisy Environment

2015

Shaik Shafee

In this paper, radial basis neural networks [1][12][17] are examined for speech recognition of isolated Telugu words in a noisy environment, using MFCC (Mel frequency cepstral coefficients) and Gamma tone frequency coefficients as speech features. Speech feature vectors are used to train, validate and test the radial basis neural networks. Experiments conducted in an office environment under the presence of...

Segmentation of Speech and Humming in Vocal Input

2012

Adam J. Sporka, Ondřej Poláček, Jan Havlík

Non-verbal vocal interaction (NVVI) is an interaction method in which sounds other than speech produced by a human are used, such as humming. NVVI complements traditional speech recognition systems with continuous control. In order to combine the two approaches (e.g. “volume up, mmm”) it is necessary to perform a speech/NVVI segmentation of the input sound signal. This paper presents two novel ...

Revising Perceptual Linear Prediction (PLP)

2005

Florian Hönig, Georg Stemmer, Christian Hacker, Fabio Brugnara

Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) are the most popular acoustic features used in speech recognition. Which of the two methods performs better often depends on the task. In this work we develop acoustic features that combine the advantages of MFCC and PLP. Based on the observation that the techniques have many similarities, we rev...

Comparison of MPEG-7 basis projection features and MFCC applied to robust speaker recognition

2004

Hyoung-Gook Kim, Martin Haller, Thomas Sikora

Our purpose is to evaluate the efficiency of MPEG-7 basis projection (BP) features vs. Mel-scale Frequency Cepstrum Coefficients (MFCC) for speaker recognition in noisy environments. The MPEG-7 feature extraction mainly consists of a Normalized Audio Spectrum Envelope (NASE), a basis decomposition algorithm and a spectrum basis projection. Prior to the feature extraction the noise reduction alg...

Spoken Language Identification Using Hybrid Feature Extraction Methods

Journal: CoRR, 2010

Pawan Kumar, Astik Biswas, A. N. Mishra, Mahesh Chandra

This paper introduces and motivates the use of a hybrid robust feature extraction technique for a spoken language identification (LID) system. Speech recognizers use a parametric form of a signal to get the most important distinguishable features of the speech signal for the recognition task. In this paper Mel-frequency cepstral coefficients (MFCC), Perceptual linear prediction coefficients (PLP) alon...

Detecting sound events in basketball video archive

2001

Dongqing Zhang

The report proposes a method for detecting sound events in a basketball game, with a focus on detecting cheering sounds. MFCC (Mel-frequency cepstral coefficient) features are used to distinguish cheering sounds from speech and other confusing sounds. The MFCC features are fed into a neural network and classified into three classes (cheering, speech, and others). To improve the MFCC-NN pe...

Information fusion for robust speaker verification

2001

Conrad Sanderson, Kuldip K. Paliwal

In this paper we have studied two information fusion approaches, namely feature vector concatenation and decision fusion, for the task of reducing error rates in a speaker verification system used in mismatched conditions. Three types of features are fused: Mel Frequency Cepstral Coefficients (MFCC), MFCC with Cepstral Mean Subtraction (CMS) and Maximum Auto-Correlation Values (MACV). We have u...
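
Two of the ingredients named in this abstract, Cepstral Mean Subtraction and feature vector concatenation, are simple enough to sketch in a few lines of Python. This is only an illustration under assumptions: the helper names and the toy two-coefficient frames are made up, per-utterance mean removal is the common textbook form of CMS rather than a detail confirmed by the abstract, and decision fusion and MACV are not shown.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    A constant offset in the cepstral domain (for example a fixed channel
    effect) is removed by this step.
    """
    count = len(frames)
    means = [sum(col) / count for col in zip(*frames)]
    return [[c - m for c, m in zip(frame, means)] for frame in frames]


def concatenate_features(*streams):
    """Feature-level fusion: join the per-frame vectors of several streams.

    Assumption: all streams are frame-aligned and have the same number of frames.
    """
    return [[value for vector in frame_group for value in vector]
            for frame_group in zip(*streams)]


# Toy usage: two frames with two cepstral coefficients each.
mfcc = [[1.0, 2.0], [3.0, 6.0]]
mfcc_cms = cepstral_mean_subtraction(mfcc)
fused = concatenate_features(mfcc, mfcc_cms)
```

Concatenation doubles the vector length here, which is one reason decision fusion (combining classifier scores instead of features) is studied as an alternative.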

CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC

Journal: Applied Computer Science, 2023

Parkinson's disease is a recognizable clinical syndrome with a variety of causes and presentations; it is a rapidly growing neurodegenerative disorder. Since about 90 percent of sufferers have some form of early speech impairment, recent studies on telediagnosis have focused on recognizing voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for dete...
