Search results for: spectro temporal features

Number of results: 749040

2008
Tiago H. Falk, Wai-Yip Chan

Features derived from an auditory spectro-temporal representation of speech are proposed for robust far-field speaker identification. The auditory representation is obtained by first filtering the speech signal with a gammatone filterbank. A modulation filterbank is then applied to the temporal envelope of each gammatone filter output. Compared to commonly used mel-frequency cepstral coefficien...
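The two-stage analysis described in this snippet (a gammatone filter, then modulation analysis of the temporal envelope) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the envelope is taken by rectification and smoothing, and a single DFT probe stands in for a full modulation filterbank. The ERB formula is the standard Glasberg-Moore approximation.

```python
import math

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Impulse response of one gammatone filter centred at fc (Hz), peak-normalised."""
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)  # Glasberg-Moore ERB in Hz
    b = 1.019 * erb                          # bandwidth parameter
    n = int(duration * fs)
    ir = [(k / fs) ** (order - 1)
          * math.exp(-2 * math.pi * b * k / fs)
          * math.cos(2 * math.pi * fc * k / fs) for k in range(n)]
    peak = max(abs(x) for x in ir)
    return [x / peak for x in ir]

def convolve(x, h):
    """Causal FIR filtering; output has the same length as x."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(min(len(h), n + 1)):
            acc += h[k] * x[n - k]
        y.append(acc)
    return y

def envelope(x, win):
    """Full-wave rectify, then smooth with a moving average of `win` samples."""
    rect = [abs(v) for v in x]
    out, acc = [], 0.0
    for i, v in enumerate(rect):
        acc += v
        if i >= win:
            acc -= rect[i - win]
        out.append(acc / win)
    return out

def modulation_energy(env, fs, fm):
    """Magnitude of the mean-removed envelope's DFT at modulation frequency fm (Hz)."""
    mean = sum(env) / len(env)
    re = sum((v - mean) * math.cos(2 * math.pi * fm * i / fs) for i, v in enumerate(env))
    im = sum((v - mean) * math.sin(2 * math.pi * fm * i / fs) for i, v in enumerate(env))
    return math.hypot(re, im) / len(env)

# Toy input (an assumption for illustration): 1 kHz tone, amplitude-modulated at 4 Hz.
fs = 8000
signal = [(1 + 0.8 * math.sin(2 * math.pi * 4 * i / fs))
          * math.sin(2 * math.pi * 1000 * i / fs) for i in range(fs)]
band = convolve(signal, gammatone_ir(1000, fs))
env = envelope(band, win=80)                 # 10 ms smoothing window
e4 = modulation_energy(env, fs, 4)           # probe at the true modulation rate
e16 = modulation_energy(env, fs, 16)         # probe at an unrelated rate
```

A full feature extractor would repeat this over many centre frequencies and modulation bands; the single channel here only shows how the stages connect.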

2015
Souli Sameh, Zied Lachiri

The paper presents the task of recognizing environmental sounds for audio surveillance and security applications. Various characteristics have been proposed for audio classification, including the popular Mel-frequency cepstral coefficients (MFCCs), which describe the audio spectral shape. However, there also exist some temporal-domain features. These have been developed to character...

2011
Martin Heckmann, Xavier Domont, Frank Joublin, Christian Goerick

Most common speech features, such as Mel Cepstral Coefficients (MFCCs) and RelAtive SpecTrAl Perceptual Linear Predictive (RASTA-PLP) features, use only spectral information. However, measurements in the mammalian auditory cortex show that the mammalian brain jointly uses spectral and temporal information. To model this we previously developed Hierarchical SpectroTemporal (HIST) features [1...

Journal: Pattern Recognition Letters 2005
Yong-Choon Cho, Seungjin Choi

A parts-based representation is a way of understanding object recognition in the brain. Nonnegative matrix factorization (NMF) is an algorithm that can learn a parts-based representation by allowing only non-subtractive combinations (Lee and Seung, 1999). In this paper we incorporate a parts-based representation of spectro-temporal sounds into the acoustic feature extraction, which ...
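As a rough illustration of how NMF yields non-subtractive parts, the sketch below applies the multiplicative update rules for the squared-error objective, as commonly attributed to Lee and Seung. It is a toy under stated assumptions, not the feature extractor from the paper: the helper names, the small matrix `V` (standing in for a nonnegative spectrogram), and the `eps` guard against division by zero are all introduced here for illustration.

```python
import random

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

# Toy nonnegative "spectrogram" built from two parts (rows: frequency, columns: time).
V = matmul([[1, 0], [2, 0], [0, 1], [0, 3]],
           [[1, 0, 2, 0, 1], [0, 1, 0, 2, 1]])
W, H = nmf(V, rank=2)
```

Because the updates only multiply by nonnegative ratios, `W` and `H` stay nonnegative throughout, which is what makes the learned columns of `W` interpretable as additive parts.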

2004
Shajith Ikbal, Mathew Magimai-Doss, Hemant Misra, Hervé Bourlard

In this paper, we introduce a new noise-robust representation of the speech signal obtained by locating points of potential importance in the spectrogram and parameterizing the activity of the time-frequency pattern around those points. These features are referred to as Spectro-Temporal Activity Pattern (STAP) features. The suitability of these features for noise-robust speech recognition is examined ...

2017
Runnan Li, Zhiyong Wu, Yishuang Ning, Lifa Sun, Helen M. Meng, Lianhong Cai

In speech, speaker identity is characterized mostly by the spectro-temporal structures of the spectrum. Although recent research has demonstrated the effectiveness of employing long short-term memory (LSTM) recurrent neural networks (RNNs) in voice conversion, traditional LSTM-RNN based approaches usually focus on temporal evolutions of speech features only. In this paper, we improve the con...

2011
Stuart Rosen, Richard J. S. Wise, Shabneet Chadha, Eleanor-Jayne Conway, Sophie K. Scott

BACKGROUND: The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialisation for specific acoustic features in speech, particularly regarding 'rapid temporal processing'. METHODOLOGY: A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulat...

2012
Marc René Schädler, Birger Kollmeier

Physiologically motivated feature extraction methods based on 2D-Gabor filters have already been used successfully in robust automatic speech recognition (ASR) systems. Recently it was shown that a Mel Frequency Cepstral Coefficients (MFCC) baseline can be improved with physiologically motivated features extracted by a 2D-Gabor filter bank (GBFB). Besides physiologically inspired approaches to ...

Journal: The Journal of the Acoustical Society of America 2021

Journal: The Journal of the Acoustical Society of America 2012
Marc René Schädler, Bernd T Meyer, Birger Kollmeier

In an attempt to increase the robustness of automatic speech recognition (ASR) systems, a feature extraction scheme is proposed that takes spectro-temporal modulation frequencies (MF) into account. This physiologically inspired approach uses a two-dimensional filter bank based on Gabor filters, which limits the redundant information between feature components, and also results in physically int...
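To make the idea of a two-dimensional Gabor filter tuned to spectro-temporal modulation frequencies more concrete, here is a minimal sketch. It assumes a Gaussian envelope and a complex sinusoidal carrier; the envelope shape, the parameter names, and the toy spectrogram are illustrative choices and are not taken from the paper.

```python
import cmath
import math

def gabor_2d(omega_t, omega_f, sigma_t, sigma_f, half_t, half_f):
    """Complex 2D Gabor kernel; rows are frequency-channel offsets, columns are frame offsets.

    omega_t / omega_f are modulation frequencies in radians per frame / per channel.
    """
    kernel = []
    for f in range(-half_f, half_f + 1):
        row = []
        for t in range(-half_t, half_t + 1):
            env = math.exp(-0.5 * ((t / sigma_t) ** 2 + (f / sigma_f) ** 2))
            carrier = cmath.exp(1j * (omega_t * t + omega_f * f))
            row.append(env * carrier)
        kernel.append(row)
    return kernel

def filter_response(spec, kernel, f0, t0):
    """Inner product of the kernel with the spectrogram patch centred at (f0, t0)."""
    half_f, half_t = len(kernel) // 2, len(kernel[0]) // 2
    acc = 0j
    for i, row in enumerate(kernel):
        for j, k in enumerate(row):
            f, t = f0 + i - half_f, t0 + j - half_t
            if 0 <= f < len(spec) and 0 <= t < len(spec[0]):
                acc += k * spec[f][t]
    return acc

# Toy spectrogram (an assumption): 9 channels x 41 frames with a purely
# temporal modulation of period 8 frames.
omega = 2 * math.pi / 8
spec = [[1 + math.cos(omega * t) for t in range(41)] for _ in range(9)]

matched = gabor_2d(omega, 0.0, sigma_t=4, sigma_f=2, half_t=12, half_f=4)
mismatched = gabor_2d(2 * math.pi / 3, 0.0, sigma_t=4, sigma_f=2, half_t=12, half_f=4)
r_matched = abs(filter_response(spec, matched, 4, 20))
r_mismatched = abs(filter_response(spec, mismatched, 4, 20))
```

A filter bank in this spirit would use a set of such kernels covering different temporal and spectral modulation frequencies, with the response magnitudes (or real parts) serving as features.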
