Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR
Abstract
In current speech recognition systems, speech is represented by a 2-D sequence of parameters that model the temporal evolution of the spectral envelope of speech. Linear transformations or filtering along both the time and frequency axes of that 2-D sequence are used to enhance the discriminative ability and robustness of the speech parameters in the HMM pattern-matching formalism. In this paper, we compare two recently reported approaches that operate on the sequence of logarithmically compressed mel-scaled filter-bank energies: the first approach, TIFFING (TIme and Frequency FilterING), applies FIR filters to that 2-D sequence along both axes, while the second, CTM (Cepstral Time Matrix), uses the DCT to compute a set of parameters in the 2-D transformed domain. The two approaches are compared in three ways: (1) analytically, using the Fourier transform; (2) statistically; and (3) through recognition tests with clean and noisy speech.
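The two operations described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the filter taps, the unnormalised DCT-II basis, the block size, the number of retained coefficients, and all function names (`fir_along_axis`, `tiffing`, `dct_basis`, `cepstral_time_matrix`) are assumptions chosen for the example, and random numbers stand in for real log mel filter-bank energies.

```python
import numpy as np

def fir_along_axis(x, taps, axis):
    """Apply a 1-D FIR filter along one axis; edges are zero-padded, shape is kept."""
    taps = np.asarray(taps, dtype=float)
    return np.apply_along_axis(
        lambda v: np.convolve(v, taps, mode="same"), axis, x
    )

def tiffing(log_fbe, freq_taps, time_taps):
    """Filter a (frames x bands) array along the band axis, then along the time axis."""
    filtered = fir_along_axis(log_fbe, freq_taps, axis=1)
    return fir_along_axis(filtered, time_taps, axis=0)

def dct_basis(n_out, n_in):
    """First n_out rows of an unnormalised DCT-II basis for length-n_in inputs."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_in)

def cepstral_time_matrix(block, n_time, n_freq):
    """2-D DCT of a (frames x bands) block, keeping the low-order coefficients."""
    n_frames, n_bands = block.shape
    return dct_basis(n_time, n_frames) @ block @ dct_basis(n_freq, n_bands).T

# Stand-in data: 8 frames of 12 log filter-bank energies (random, for illustration).
rng = np.random.default_rng(0)
log_fbe = rng.normal(size=(8, 12))

# Hypothetical taps: a first-difference-like filter on each axis.
tiff = tiffing(log_fbe, freq_taps=[1.0, 0.0, -1.0], time_taps=[0.5, 0.0, -0.5])
ctm = cepstral_time_matrix(log_fbe, n_time=3, n_freq=6)
print(tiff.shape, ctm.shape)
```

In this sketch the filtering route keeps one output value per frame and band, whereas the DCT route summarises a whole block of frames with a small matrix of low-order coefficients.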
Similar resources
Application of Single-Frequency Time-Space Filtering Technique for Seismic Ground Roll and Random Noise Attenuation
Time-frequency filtering is an accepted technique for attenuating noise in 2-D (time-space) and 3-D (time-space-space) reflection seismic data. The common approach is to transform each seismic signal from the 1-D time domain to a 2-D time-frequency domain, denoise the signal with a designed filter, and finally transform the filtered signal back to the original time domain. ...
Feature Extraction and Classification for Automatic Speaker Recognition System – A Review
Automatic speaker recognition (ASR) has found wide application in industries such as banking, security, and forensics because it is easy to implement, secure, and user friendly. A good recognition rate is a prerequisite for any ASR system and can be achieved by making an optimal choice among the available techniques for ASR. In this paper, different techniq...
Time-frequency principal components of speech: application to speaker identification
In this paper, we propose a formalism, called vector filtering of spectral trajectories, which brings many speech parameterization approaches together under a common framework. We then propose a new filtering within this framework, called time-frequency principal components (TFPC) of speech. We apply this new filtering to speaker identification, using a subset of the POLYCO...
Analysis of Planar Microstrip Circuits Using Three-Dimensional Transmission Line Matrix Method
The frequency-dependent characteristics of microstrip planar circuits have previously been analyzed using several full-wave approaches. All of those methods give the characteristics of the circuits directly, frequency by frequency. Computation time becomes important if these planar circuits have to be studied over a very large bandwidth. The transmission line matrix (TLM) method presented in this paper ...
Real Time Speech Recognition Using DSK TMS320C6713
Speech recognition is an important field of digital signal processing. The objective of Automatic Speaker Recognition (ASR) is to extract features and to characterize and recognize the speaker. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used feature vector for ASR. MFCC is used for designing a text-dependent speaker identification system. In this paper the DSP processor TMS320C6713 with Code Comp...