speech feature extraction

Acoustic and Data-driven Features for Robust Speech Activity Detection

2012

Samuel Thomas, Sri Harish Reddy Mallidi, Thomas Janu, Hynek Hermansky, Nima Mesgarani, Xinhui Zhou, Shihab A. Shamma, Tim Ng, Bing Zhang, Long Nguyen, Spyridon Matsoukas

In this paper we evaluate different features for speech activity detection (SAD). Several signal processing techniques are used to derive acoustic features that capture attributes of speech useful in differentiating speech segments in noise. The acoustic features include short-term spectral features, long-term modulation features both derived using Frequency Domain Linear Prediction (FDLP), and...

Extended weighted linear prediction using the autocorrelation snapshot - a robust speech analysis method and its application to recognition of vocal emotions

2013

Jouni Pohjalainen, Paavo Alku

Temporally weighted linear predictive methods have recently been successfully used for robust feature extraction in speech and speaker recognition. This paper introduces their general formulation, where various efficient temporal weighting functions can be included in the optimization of the all-pole coefficients of a linear predictive model. Temporal weighting is imposed by multiplying element...
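As a rough illustration of temporally weighted linear prediction, the sketch below solves the weighted normal equations for the all-pole coefficients. It shows only the plain weighted formulation, not the extended method the paper introduces; the function name `weighted_lp` and the choice of weights are assumptions made for this example.

```python
def weighted_lp(x, weights, order):
    """Weighted linear prediction (illustrative sketch).

    Minimises sum_n weights[n] * (x[n] - sum_k a[k] * x[n-1-k])**2
    over n >= order and returns the predictor coefficients a.
    """
    p = order
    # Accumulate the weighted correlation matrix R and vector r.
    R = [[0.0] * p for _ in range(p)]
    r = [0.0] * p
    for n in range(p, len(x)):
        w = weights[n]
        for i in range(p):
            r[i] += w * x[n] * x[n - 1 - i]
            for k in range(p):
                R[i][k] += w * x[n - 1 - i] * x[n - 1 - k]
    # Solve R a = r by Gaussian elimination with partial pivoting.
    A = [R[i] + [r[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, p):
            factor = A[row][col] / A[col][col]
            for j in range(col, p + 1):
                A[row][j] -= factor * A[col][j]
    # Back substitution.
    a = [0.0] * p
    for i in reversed(range(p)):
        s = A[i][p] - sum(A[i][j] * a[j] for j in range(i + 1, p))
        a[i] = s / A[i][i]
    return a
```

With all weights equal to one, this reduces to ordinary covariance-method linear prediction; a non-uniform weighting function changes how much each sample's prediction error counts.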

Robust feature extraction using subband spectral centroid histograms

2001

Bojana Gajic, Kuldip K. Paliwal

In this paper we propose a new framework for utilizing frequency information from the short-term power spectrum of speech. Feature extraction is based on the cepstral coefficients derived from the histograms of subband spectral centroids (SSC). Two new feature extraction algorithms are proposed, one based on frequency information alone, and the other which efficiently combines the frequency and...
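A subband spectral centroid is usually computed as the power-weighted mean frequency within each subband. The sketch below shows that calculation only; the histogram and cepstral steps mentioned in the abstract are not covered, and the function name, the `gamma` exponent, and the band edges are assumptions chosen for illustration.

```python
def subband_centroids(power, freqs, band_edges, gamma=1.0):
    """Power-weighted mean frequency of each subband (illustrative sketch).

    power      : power spectrum values, one per frequency bin
    freqs      : bin frequencies in Hz, same length as power
    band_edges : ascending subband boundaries in Hz (assumed by the caller)
    gamma      : exponent applied to the power before weighting (assumed)
    """
    centroids = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        num = 0.0
        den = 0.0
        for f, p in zip(freqs, power):
            if lo <= f < hi:
                w = p ** gamma
                num += f * w
                den += w
        # Fall back to the band midpoint when the subband carries no power.
        centroids.append(num / den if den > 0 else 0.5 * (lo + hi))
    return centroids
```

For example, with bins every 100 Hz and all the power of the first subband concentrated at 200 Hz, the first centroid is 200 Hz regardless of the band edges.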

Robust distributed speech recognition in noise and packet loss conditions

Journal: Digital Signal Processing, 2010

Ronan Flynn, Edward Jones

This paper examines the performance of a Distributed Speech Recognition (DSR) system in the presence of both background noise and packet loss. Recognition performance is examined for feature vectors extracted from speech using a physiologically-based auditory model, as an alternative to the more commonly-used Mel Frequency Cepstral Coefficient (MFCC) front-...

Speaker Recognition Using DWT-MFCC with Multi-SVM Classifier

2017

This paper describes a hybrid technique for speaker recognition. Speaker recognition is the task of identifying a person based on characteristics of the speech wave such as pitch, tone, and cepstral coefficients. Here the DWT and MFCC techniques are employed for feature extraction. A combination of two or more techniques is called a hybrid technique. DWT divides the speech signal into differ...

Speaker Dependent Word Recognition Using MFCC and VQ

2004

Nitin N Lokhande, Chandrakant Kadu

The paper presents an effective method for recognition of digits and numbers. Most speech recognition systems contain two main modules: “feature extraction” and “feature matching”. In this project, the Mel Frequency Cepstrum Coefficient (MFCC) algorithm is used to simulate the feature extraction module. Using this algorithm, the cepstral coefficients are calculated on the Mel frequency scale. VQ (vecto...
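To make "calculated on the Mel frequency scale" concrete, the sketch below converts between Hz and mel using one commonly used formula, mel = 2595 log10(1 + f/700), and places filter centre frequencies evenly on the mel axis. The abstract does not say which mel formula or filter count the authors used, so the constants and function names here are assumptions.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (one common formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_low, f_high):
    """Centre frequencies in Hz of n_filters spaced evenly on the mel axis."""
    lo = hz_to_mel(f_low)
    hi = hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

Because the mapping is roughly logarithmic above 1 kHz, equal steps in mel give centre frequencies that are closely spaced at low frequencies and increasingly far apart at high frequencies.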

HMM based Automatic Speech Recognition Analysis

2015

Preeti Saini

The main motive of this project, 'HMM Based Automatic Speech Recognition Analysis', is to build an automatic speech recognizer that is clear and accurate, using a Hidden Markov Model (HMM) to get accurate results over a number of frequency ranges related to the human voice. The data consist of 12 different words recorded by a number of different speakers, both male and female ...

Spoken Language Identification Using Hybrid Feature Extraction Methods

Journal: CoRR, 2010

Pawan Kumar, Astik Biswas, A. N. Mishra, Mahesh Chandra

This paper introduces and motivates the use of a hybrid robust feature extraction technique for a spoken language identification (LID) system. Speech recognizers use a parametric form of a signal to obtain the most important distinguishing features of the speech signal for the recognition task. In this paper Mel-frequency cepstral coefficients (MFCC), Perceptual linear prediction coefficients (PLP) alon...

Speech Emotion Classification using Machine Learning

2015

Pooja Yadav, Gaurav Aggarwal

In recent years, the interaction between humans and machines has become an issue of concern. This paper surveys research on the six basic human emotions: anger, dislike, fear, happiness, sadness and surprise [1, 3]. Features are extracted from voice utterances recorded from different persons. The various features like...

端點偵測技術在強健語音參數擷取之研究 (Study of the Voice Activity Detection Techniques for Robust Speech Feature Extraction) [In Chinese]

2007

Wen-Hsiang Tu, Jeih-Weih Hung

The performance of a speech recognition system is often degraded by the mismatch between the development and application environments. One of the major sources of this mismatch is additive noise. The approaches for handling additive noise can be divided into three classes: speech enhancement, robust speech feature extraction, and compensation of speech mode...
