cepstrum

Artificial Neural Network & Mel-Frequency Cepstrum Coefficients-Based Speaker Recognition

2005

Adjoudj Réda, Boukelif Aoued

Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This technique makes it possible to use the speaker’s voice to verify their identity and control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security co...

Speaker Verification System based on the Cerebellum Architecture

2006

Abdul Wahab, Mathias Dharmawirya, Hiok Chai Quek

The Cerebellar Model Articulation Controller (CMAC) and fuzzy systems have been active areas of research since their initial introduction. Fuzzy CMAC is a CMAC coupled with a fuzzy system. This paper presents the incorporation of fuzzy systems into CMAC with the Approximate Analogical Reasoning Scheme (AARS) as its inference rules and Discrete Incremental Clustering (DIC) as its tec...

Speech Emotion Recognition Using Iterative Clustering Technique

2015

A. REVATHY

This paper proposes a method to recognize the emotion present in a speech signal using an iterative clustering technique. We propose Mel Frequency Perceptual Linear Predictive Cepstrum (MFPLPC) as a feature for recognizing emotions. This feature is extracted from the speech, and clustering models are generated for each emotion. For the speaker-independent classification technique, preproc...

A Discrete-cepstrum Based Spectrum-envelope Estimation Scheme and Its Example Applications of Voice Transformation

Journal: IJCLCLP, 2009

Hung-Yan Gu, Sung-Feng Tsai

Approximating a spectral envelope via regularized discrete cepstrum coefficients has been proposed by previous researchers. In this paper, we study two problems encountered in practice when adopting this approach to estimate the spectral envelope. The first is which spectral peaks should be selected, and the second is which frequency axis scaling function should be adopted. After some efforts o...

Quantification of glottal and voiced speech harmonics-to-noise ratios using cepstral-based estimation

2005

Peter J. Murphy, Olatunji O. Akande

Cepstral analysis is used to estimate the harmonics-to-noise ratio (HNR) in speech signals. The inverse Fourier transformed liftered cepstrum approximates a noise baseline from which the harmonics-to-noise ratio is estimated. The present study highlights the manner in which the cepstrum-based noise baseline estimate is obtained, essentially behaving like a moving average filter applied to the p...
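
The smoothing behaviour mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' HNR procedure: it only shows that keeping the low-quefrency part of the real cepstrum and transforming back yields a smoothed log spectrum. The function name `liftered_log_spectrum`, the Hann window, and the cutoff `n_keep` are assumptions made for illustration.

```python
import numpy as np

def liftered_log_spectrum(frame, n_keep):
    """Return (log_mag, smoothed) for one real-valued frame.

    log_mag  : log-magnitude spectrum of the windowed frame
    smoothed : log spectrum rebuilt from only the first `n_keep`
               low-quefrency cepstral bins (hypothetical cutoff)
    """
    n = len(frame)
    spectrum = np.fft.fft(frame * np.hanning(n))
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)

    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    cepstrum = np.fft.ifft(log_mag).real

    # Low-pass lifter: keep bins 0..n_keep-1 and their mirrored counterparts
    # so the liftered cepstrum stays symmetric.
    lifter = np.zeros(n)
    lifter[:n_keep] = 1.0
    lifter[n - n_keep + 1:] = 1.0

    # Transforming the liftered cepstrum back gives the smoothed log spectrum.
    smoothed = np.fft.fft(cepstrum * lifter).real
    return log_mag, smoothed
```

A small `n_keep` gives a heavily smoothed curve; letting the lifter pass every bin returns the original log spectrum unchanged.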

Intelligibility of speech with filtered time trajectories of spectral envelopes

1996

Takayuki Arai, Misha Pavel, Hynek Hermansky, Carlos Avendaño

The effect of filtering the time trajectories of spectral envelopes on speech intelligibility was investigated. Since the LPC cepstrum forms the basis of many automatic speech recognition systems, we filtered the time trajectories of the LPC cepstrum of speech sounds and reconstructed the modified speech after filtering. For processing, we applied low-pass, high-pass and band-pass filters. The res...
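
Filtering a time trajectory means treating each cepstral coefficient as a signal over frames and filtering along the frame axis. The sketch below assumes a hypothetical array `cepstra` of shape (frames, coefficients) and uses a plain moving average as a stand-in low-pass filter. The filters actually used in the paper are not given in this excerpt.

```python
import numpy as np

def lowpass_trajectories(cepstra, width):
    """Moving-average each cepstral coefficient along the time (frame) axis.

    cepstra : array of shape (n_frames, n_coeffs), assumed layout
    width   : length of the averaging window in frames (assumed parameter)
    """
    kernel = np.ones(width) / width
    # Filter every column (one coefficient's trajectory) independently.
    # mode="same" keeps the number of frames; edge frames are attenuated
    # because the window runs past the ends of the trajectory.
    return np.apply_along_axis(
        lambda traj: np.convolve(traj, kernel, mode="same"), 0, cepstra
    )
```

A high-pass version can be obtained under the same assumptions by subtracting this output from the input trajectories.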

A new voice transformation method based on both linear and nonlinear prediction analysis

1996

Ki-Seung Lee, Dae Hee Youn, Il-Whan Cha

In this paper, we describe a voice transformation method which changes a source speaker's acoustic features to those of a target speaker. In the method developed here, acoustic features are divided into two parts, linear and nonlinear. The linear part is characterized by LPC cepstrum coefficients obtained from LP analysis. The nonlinear part, which represents the excitation signal, is...

Comparison of MPEG-7 basis projection features and MFCC applied to robust speaker recognition

2004

Hyoung-Gook Kim, Martin Haller, Thomas Sikora

Our purpose is to evaluate the efficiency of MPEG-7 basis projection (BP) features vs. Mel-scale Frequency Cepstrum Coefficients (MFCC) for speaker recognition in noisy environments. The MPEG-7 feature extraction mainly consists of a Normalized Audio Spectrum Envelope (NASE), a basis decomposition algorithm and a spectrum basis projection. Prior to the feature extraction the noise reduction alg...

Joint optimization of multiple neural codebooks in a hybrid connectionist-HMM speech recognition system

1993

Gerhard Rigoll

This paper proposes a new approach for a hybrid connectionist-HMM speech recognition system. The system consists of a multi-feature HMM-based recognition module using three different neural networks as multiple neural codebooks. Each neural network receives a different feature (i.e. cepstrum, delta cepstrum, and delta power) as input and generates a vector quantizer label obtained from the firin...

Nonlinear Discriminant Feature Extraction for Robust Text-independent Speaker Recognition

1998

Yochai Konig, Larry Heck, Mitch Weintraub, Kemal Sonmez

We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by nonlinearly projecting a large set of acoustic features (e.g., several frames) to a lower-dimensional feature set. The extracted features are optimized to discriminate between speakers ...
