Search results for: cepstral

Number of results: 2662

2009
K. GOPALAN TAO CHU XIAOFENG MIAO

This paper describes the preliminary results of a keyword spotting system using a fusion of spectral and cepstral features. Spectral energy in 16 bands of frequencies on Bark scale and 16 mel-scale warped cepstral coefficients are used independently and in combination with appropriate weights for recognizing word utterances. Results of matching features using Euclidean and cosine distances in a...
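As a rough illustration of the matching step mentioned in this abstract, the sketch below computes Euclidean and cosine distances between feature vectors and combines a spectral and a cepstral distance with a weight. The function names, the `w_spec` parameter, and the linear weighting rule are assumptions made for illustration; the abstract does not say how the weights are chosen or applied.

```python
import math


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance, defined here as 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def fused_distance(spec_a, spec_b, cep_a, cep_b, w_spec=0.5,
                   distance=euclidean_distance):
    """Weighted sum of a spectral and a cepstral distance.

    w_spec is a hypothetical weight in [0, 1]; the cepstral distance
    receives the remaining weight (1 - w_spec).
    """
    d_spec = distance(spec_a, spec_b)
    d_cep = distance(cep_a, cep_b)
    return w_spec * d_spec + (1.0 - w_spec) * d_cep
```

In practice the two feature sets would likely need to be scaled to comparable ranges before a weighted sum of this kind is meaningful.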

2011
Luís Almeida Paulo Menezes Jorge Dias

Vergence ability is an important visual behavior observed in living creatures when they use vision to interact with the environment. The notion of an active observer is equally useful for robotic vision systems in tasks like object tracking, fixation and 3D environment structure recovery. Humanoid robotics is a potential playground for such behaviors. This paper describes the implementation of a ...

2007
Chang-Wen Hsu Lin-Shan Lee

Cepstral normalization is widely used as a powerful approach to producing robust features for speech recognition. A new approach, Powered Cepstral Normalization (P-CN), was recently proposed to normalize the MFCC parameters in the r1-th order powered domain, where r1 > 1.0, and then transform the features back by a 1/r2 power order to a better recognition domain, and it was shown to pr...
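For context, the sketch below shows conventional cepstral mean and variance normalization (CMVN), the baseline family of techniques this abstract builds on. It is not the P-CN method itself, whose powered-domain details are cut off in the abstract. The function name `cmvn` and the `eps` guard are assumptions made for this example.

```python
import math


def cmvn(frames, eps=1e-10):
    """Per-coefficient mean and variance normalization over an utterance.

    frames: list of equal-length cepstral vectors, one per time frame.
    Returns a new list where each coefficient has zero mean and, unless
    it is constant across frames, unit variance.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    # eps avoids division by zero when a coefficient never changes.
    return [
        [(f[d] - means[d]) / max(stds[d], eps) for d in range(dim)]
        for f in frames
    ]
```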

1998
John W. McDonough Alan V. Oppenheim Philip E. Gill Walter Murray John R. Deller John G. Proakis

Speaker normalization is a process in which the short-time features of speech from a given speaker are transformed so as to better match some speaker independent model. Vocal tract length normalization (VTLN) is a popular speaker normalization scheme wherein the frequency axis of the short-time spectrum associated with a particular speaker’s speech is rescaled or warped prior to the extraction ...

2008
Vikramjit Mitra Daniel Garcia-Romero Carol Y. Espy-Wilson

This paper presents an audio genre detection framework that can be used for a multi-language audio corpus. Cepstral coefficients are considered and analyzed as the feature set for both a language dependent and language independent genre identification (GID) task. Language information is found to increase the overall detection accuracy on an average by at least 2.6% from its language independent...

2008
Namunu Chinthaka Maddage Haizhou Li

Sung language recognition relies on both effective feature extraction and acoustic modeling. In this paper, we study rhythm based music segmentation in which the frame size varies in proportion to inter-beat interval of the music, in contrast to fixed length segmentation (FIX) in spoken language recognition. We show that acoustic feature extracted from the BSS scheme outperforms that from FIX. ...

2007
Todd A. Stephenson

Artificial neural networks (ANNs) have been used to classify phonetic features in speech. The feature streams from the ANNs are used here as the observations for Hidden Markov Models (HMMs). Using such observations allows us to build a competitive speech recogniser. This recogniser is compared to a similar recogniser that was trained on mel-frequency cepstral coefficients (MFCCs). While the cepstr...

1997
Simon Dobrisek France Mihelic Nikola Pavesic

This paper presents an effort to provide a more efficient speech signal representation, which aims to be incorporated into an automatic speech recognition system. Modified cepstral coefficients, derived from a multiresolution auditory spectrum are proposed. The multiresolution spectrum was obtained using sliding single point discrete Fourier transformations. It is shown that the obtained spectr...

2007
José Ramón Calvo de Lara Rafael Fernández Gabriel Hernández

Recently, the Shifted Delta Cepstral (SDC) feature was reported to produce superior performance to the delta and delta-delta features in cepstral feature based language identification (LID) systems [1, 2]. This paper examines the application of SDC features in speaker verification and evaluates their robustness to channel mismatch, manner of speaking and session variability. The result of the experim...
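A minimal sketch of how shifted delta cepstra are commonly computed: for each frame, k delta vectors taken at shifts of P frames are stacked, each delta being the difference between frames d ahead and d behind. The parameter names follow the usual N-d-P-k convention; the default values and the edge-clamping rule are assumptions for illustration and are not taken from this abstract.

```python
def shifted_delta_cepstra(frames, d=1, p=3, k=7):
    """Stack k shifted delta vectors for every frame.

    frames: list of equal-length cepstral vectors, one per time frame.
    For frame t and block i (0 <= i < k) the delta is
        c[t + i*p + d] - c[t + i*p - d].
    Indices outside the utterance are clamped to the first or last frame
    (an assumption of this sketch; other edge policies exist).
    """
    n = len(frames)

    def frame_at(t):
        return frames[min(max(t, 0), n - 1)]

    out = []
    for t in range(n):
        vec = []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            vec.extend(a - b for a, b in zip(ahead, behind))
        out.append(vec)
    return out
```

Each output vector has k times the dimension of an input frame, so 7 cepstral coefficients with k=7 would give 49 values per frame.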

2015
Ashish Panda

This paper addresses the problem of speaker verification in the presence of additive noise. We propose a fast implementation of Psychoacoustic Model Compensation (Psy-Comp) scheme for static features along with model domain mean and variance normalization for robust speaker recognition in noisy conditions. The proposed algorithms are validated through experiments on noise corrupted NIST-2000 sp...
