coefficient mfcc

Nonlinear modeling of Poincaré sections of the speech signal combined with frequency-domain analysis to improve the accuracy of speech recognition systems

Journal: :Modares Electrical Engineering 0

Ayuob Jafari (Islamic Azad University, Qazvin Branch) Farshad Almasganj (Amirkabir University of Technology) Maryam Nabi Bidhendi (Amirkabir University of Technology)

This paper presents a new method for improving the accuracy of speech recognition systems by combining feature vectors obtained from nonlinear modeling of the reconstructed phase space of the speech signal with the conventional features obtained from frequency-domain analysis. According to currently accepted theory, if enough dimensions are chosen when reconstructing the phase space of the signal, this space fully represents the dynamics of the system that generated it, and therefore...
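The phase-space reconstruction mentioned in this abstract is commonly done with time-delay embedding. The following is a minimal sketch of that general technique, not the authors' implementation; the function name `delay_embed` and the parameters `dim` (embedding dimension) and `lag` (delay in samples) are assumptions introduced for illustration.

```python
def delay_embed(signal, dim, lag):
    """Build delay vectors [x[n], x[n+lag], ..., x[n+(dim-1)*lag]].

    Hypothetical helper: `dim` is the embedding dimension and `lag`
    is the delay in samples. Neither value comes from the paper.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_vectors = len(signal) - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("signal is too short for this dim and lag")
    # Each row is one point of the reconstructed trajectory.
    return [[signal[n + k * lag] for k in range(dim)]
            for n in range(n_vectors)]


# Toy input; a real system would pass one frame of speech samples.
samples = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
points = delay_embed(samples, dim=3, lag=2)
```

With six samples, `dim=3`, and `lag=2`, only two delay vectors fit, so `points` holds two three-dimensional points. How many dimensions are "enough" depends on the signal and is not specified in the abstract.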

Scalable distributed speech recognition using Gaussian mixture model-based block quantisation

Journal: :Speech Communication 2006

Stephen So Kuldip K. Paliwal

In this paper, we investigate the use of block quantisers based on Gaussian mixture models (GMMs) for the coding of Mel frequency-warped cepstral coefficient (MFCC) features in distributed speech recognition (DSR) applications. Specifically, we consider the multi-frame scheme, where temporal correlation across MFCC frames is exploited by the Karhunen–Loève transform of the block quantiser. Comp...
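The multi-frame scheme described here can be written out as a short derivation. This is the generic textbook form of the Karhunen–Loève transform applied to stacked frames, not necessarily the authors' exact formulation; the symbols $p$ (frames per block), $d$ (MFCC dimension), and the mixture-component index $i$ are assumptions.

```latex
\begin{align}
\mathbf{x} &= \bigl[\mathbf{c}_t^{\top},\ \mathbf{c}_{t+1}^{\top},\ \dots,\ \mathbf{c}_{t+p-1}^{\top}\bigr]^{\top} \in \mathbb{R}^{pd}
  && \text{(stack $p$ consecutive MFCC frames)} \\
\boldsymbol{\Sigma}_i &= \mathbf{P}_i \boldsymbol{\Lambda}_i \mathbf{P}_i^{\top}
  && \text{(eigendecomposition of the covariance of component $i$)} \\
\mathbf{y} &= \mathbf{P}_i^{\top}\,(\mathbf{x} - \boldsymbol{\mu}_i)
  && \text{(decorrelated coefficients to be quantised)}
\end{align}
```

Because $\boldsymbol{\Sigma}_i$ is estimated on the stacked vector $\mathbf{x}$, its off-diagonal blocks capture correlation between neighbouring frames, which is what the transform removes before quantisation.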

Sound Retrieval and Ranking Using Sparse Auditory Representations

Journal: :Neural computation 2010

Richard F. Lyon Martin Rehn Samy Bengio Thomas C. Walters Gal Chechik

To create systems that understand the sounds that humans are exposed to in everyday life, we need to represent sounds with features that can discriminate among many different sound classes. Here, we use a sound-ranking framework to quantitatively evaluate such representations in a large-scale task. We have adapted a machine-vision method, the passive-aggressive model for image retrieval (PAMIR)...

Speaker Independent Continuous Speech to Text Converter for Mobile Application

Journal: :CoRR 2013

R. Sandanalakshmi P. Abinaya Viji M. Kiruthiga M. Manjari M. Sharina

An efficient speech-to-text converter for mobile applications is presented in this work. The prime motive is to formulate a system that gives optimum performance in terms of complexity, accuracy, delay and memory requirements for the mobile environment. The speech-to-text converter consists of two stages, namely front-end analysis and pattern recognition. The front end analysis involves prepro...

Frequency warping for VTLN and speaker adaptation by linear transformation of standard MFCC

Journal: :Computer Speech & Language 2009

Sankaran Panchapagesan Abeer Alwan

Vocal Tract Length Normalization (VTLN) for standard filterbank-based Mel Frequency Cepstral Coefficient (MFCC) features is usually implemented by warping the center frequencies of the Mel filterbank, and the warping factor is estimated using the maximum likelihood score (MLS) criterion (Lee and Rose, 1998). A linear transform (LT) equivalent for frequency warping (FW) would enable more efficie...
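The conventional approach this abstract starts from, warping the Mel filterbank center frequencies, can be sketched as follows. This is a simplified illustration under stated assumptions: the Mel formula `2595 * log10(1 + f / 700)` is one common convention, plain linear scaling by `alpha` stands in for the piecewise-linear warps often used in practice, and all function names and parameter values are hypothetical. Estimating `alpha` by maximum likelihood is not shown.

```python
import math


def hz_to_mel(f_hz):
    """Hz to mel, using one common convention (an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_low, f_high):
    """Center frequencies (Hz) of filters spaced uniformly on the mel scale."""
    mel_low = hz_to_mel(f_low)
    mel_high = hz_to_mel(f_high)
    step = (mel_high - mel_low) / (n_filters + 1)
    return [mel_to_hz(mel_low + (i + 1) * step) for i in range(n_filters)]


def warp_center_frequencies(centers_hz, alpha, f_high):
    """Scale each center frequency by alpha, clipped to the band edge.

    Linear scaling is a simplification chosen for this sketch.
    """
    return [min(alpha * f, f_high) for f in centers_hz]


# Illustrative values only; none of them come from the paper.
centers = mel_center_frequencies(n_filters=20, f_low=0.0, f_high=8000.0)
warped = warp_center_frequencies(centers, alpha=0.9, f_high=8000.0)
```

Each candidate `alpha` requires rebuilding the filterbank and recomputing the features, which is the cost a linear-transform equivalent aims to avoid.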

Robust algorithms for speech reconstruction on mobile devices

2005

Xu Shao

This thesis is concerned with reconstructing an intelligible time-domain speech signal from speech recognition features, such as Mel-frequency cepstral coefficients (MFCCs), in a distributed speech recognition (DSR) environment. The initial reconstruction methods in this thesis require, in addition to MFCC vectors, fundamental frequency and voicing information. In the later parts of the thesis t...

Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

2017

Ahmed Kamil Hasan Al-Ali Bouchra Senadji Ganesh Naik

We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics o...

Eigen-MLLR Coefficients as New Feature Parameters for Speaker Identification

2001

Nick J.-C. Wang Wei-Ho Tsai Lin-Shan Lee

Eigen-MLLR coefficients are proposed as new feature parameters for speaker identification in this paper. By performing principal component analysis on MLLR parameters among training speakers, the eigen-MLLR coefficients (EMCs) are derived as the coefficients for the eigenvectors. The discriminating function of the new EMC features based on the Fisher criterion is found to be ten times larger than that...

A Novel Robust MFCC Extraction Method Using Sample-ISOMAP for Speech Recognition

2012

Huan Zhao Yufeng Xiao

Motivated by the nonlinear characteristics of the speech signal, this paper presents a novel robust MFCC extraction method using sample-ISOMAP. ISOMAP is a nonlinear dimensionality reduction method based on manifold theory; it can reveal the meaningful low-dimensional structure hidden in high-dimensional observations. In the proposed method, ISOMAP is first applied for calculating the...

Hypothesis-based feature combination of multiple speech inputs for robust speech recognition in automotive environments

2006

Yasunari Obuchi Nobuo Hataoka

In a microphone array system, feature combination in the MFCC domain can improve speech recognition accuracy. Multiple microphones provide different feature parameters such as MFCCs even if they have similar speech and noise signals, because of the phase difference and transmission characteristics. In this paper, we investigate how the recognition performance changes when we average multiple MF...
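Averaging MFCC vectors across microphones, as this abstract describes, can be illustrated with a small sketch. It assumes the channels are already time-aligned and have the same number of frames and coefficients; the function name `average_mfcc` and the toy values are hypothetical, and the paper's hypothesis-based combination is not reproduced here.

```python
def average_mfcc(channels):
    """Average MFCC frames across microphones.

    `channels` is a list with one entry per microphone; each entry is a
    list of frames, and each frame is a list of MFCC values. All channels
    are assumed to be time-aligned with identical shapes.
    """
    if not channels:
        raise ValueError("at least one channel is required")
    n_frames = len(channels[0])
    if any(len(ch) != n_frames for ch in channels):
        raise ValueError("all channels must have the same number of frames")
    averaged = []
    # zip(*channels) groups the frames that share a time index.
    for frames_at_t in zip(*channels):
        # zip(*frames_at_t) groups the values that share a coefficient index.
        averaged.append([sum(vals) / len(vals) for vals in zip(*frames_at_t)])
    return averaged


# Two microphones, two frames, two coefficients per frame (toy values).
mic_a = [[1.0, 2.0], [3.0, 4.0]]
mic_b = [[3.0, 4.0], [5.0, 8.0]]
combined = average_mfcc([mic_a, mic_b])
```

Here `combined` has the same shape as a single channel, so it could be passed to a recognizer that expects one MFCC stream.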
