Distance Measures for Wavelet Representation of Speech Segments

Author

  • Jakub Gałka
Abstract

The dyadic scheme of wavelet signal decomposition leads to a specific division of frequency bands. This division is comparable to the mel-frequency division and may be used for effective parameterization of the speech signal in recognition systems, speech coding, or other speech-based applications. This paper discusses the efficiency of different spectral distance measures applied to wavelet-parameterized speech. The presented methods are intended for use in the isolated phoneme recognition task.

INTRODUCTION

The human ear is sensitive to the frequency attributes of audio signals, including speech. The frequency resolution of human perception is not the same for all frequencies. In low frequency bands, slightly different tones are much more distinguishable than they would be in higher frequency bands. As a result, the frequency characteristic of the human ear is not linear but rather logarithmic. Experiments have shown that the human auditory system perceives frequencies not linearly but according to a mel-frequency scale [9]

$$ f_{mel} = 1127 \cdot \log_e\left(1 + \frac{f_{Hz}}{700}\right). \qquad (1) $$

Many practical works have confirmed the usefulness of this scale in speech recognition systems, speech coding, and compression. The most frequently used type of speech parameterization that supports the mel-frequency scale is MFCC (mel-frequency cepstral coefficients) [6]. Our proposal provides a logarithmic frequency scale division, typical for the dyadic discrete wavelet decomposition scheme, and may be an interesting alternative approach. Using this type of parameterization in speech recognition applications requires either adjusting known computational tools or developing new ones. Insofar as the wavelet parameterization scheme has properties similar to linear prediction coefficients or mel-frequency cepstral coefficients, the known computational tools may be applied. In this paper, different spectral distortion measures were applied to the wavelet parameterization scheme in order to determine their efficiency.
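The comparison between the mel scale of Eq. (1) and the dyadic band division can be illustrated with a short sketch. The code below is not the author's parameterization; it only evaluates Eq. (1) and lists the frequency bands an idealized dyadic decomposition would produce. The function names, the 16 kHz sampling rate, the six decomposition levels, and the assumption of ideal half-band filters are all chosen here for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Mel value of a frequency given in Hz, following Eq. (1)."""
    return 1127.0 * math.log(1.0 + f_hz / 700.0)


def dyadic_band_edges(sample_rate, levels):
    """Return (low, high) band edges in Hz for an idealized dyadic decomposition.

    Detail bands are listed from the finest level to the coarsest,
    followed by the remaining approximation (low-pass) band.
    Ideal half-band filters are assumed (hypothetical simplification).
    """
    bands = []
    high = sample_rate / 2.0  # Nyquist frequency
    for _ in range(levels):
        low = high / 2.0      # each level halves the remaining band
        bands.append((low, high))
        high = low
    bands.append((0.0, high))  # final approximation band
    return bands


if __name__ == "__main__":
    # Assumed values for illustration only: 16 kHz sampling, 6 levels.
    for low, high in dyadic_band_edges(sample_rate=16000.0, levels=6):
        width_hz = high - low
        width_mel = hz_to_mel(high) - hz_to_mel(low)
        print(f"{low:7.1f} - {high:7.1f} Hz | {width_hz:7.1f} Hz | {width_mel:6.1f} mel")
```

Printing the band widths in both units shows how the octave-wide bands, which double in width on the Hz axis, are spaced far more evenly on the mel axis at higher frequencies.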
WAVELET SPEECH SIGNAL PARAMETERISATION

The continuous-time wavelet transform is defined as


Similar resources

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...


Speech Encryption Using Wavelet Packets

The aim of speech scrambling algorithms is to transform clear speech into an unintelligible signal so that it is difficult to decrypt it in the absence of the key. Most of the existing speech scrambling algorithms tend to retain considerable residual intelligibility in the scrambled speech and are easy to break. Typically, a speech scrambling algorithm involves permutation of speech segments in...


Intelligent Voice Recognition System Based on Acoustic and Speaking Fundamental Frequency Characteristics

Speech recognition is a fascinating application of Digital Signal Processing and has many real-world applications. In this paper, a speech recognition system is developed for isolated spoken words using Discrete Wavelet Transforms (DWT) and Artificial Neural Networks (ANN). Speech signals are one-dimensional and are random in nature. This paper investigates Automatic Speech Recognition of gende...


On the two-wavelet localization operators on homogeneous spaces with relatively invariant measures

In the present paper, we introduce the two-wavelet localization operator for the square integrable representation of a homogeneous space with respect to a relatively invariant measure. We show that it is a bounded linear operator. We investigate some properties of the two-wavelet localization operator and show that it is a compact operator and is contained in a...


Adult Voice Recognition System using Text Variable Phoneme Model and Coarse Speaking Fundamental Frequency Characteristics

Speech recognition is a fascinating application of Digital Signal Processing and has many real-world applications. In this paper, a speech recognition system is developed for isolated spoken words using Discrete Wavelet Transforms (DWT) and Artificial Neural Networks (ANN). Spe...




Publication date: 2006