Perceptual speech coding using time and frequency masking constraints

Authors

Benito Carnero

Andrzej Drygajlo

Abstract

This paper presents a new wide-band speech coding system based on a fast wavelet packet transform algorithm as well as a formulation of temporal and spectral psychoacoustic models of masking. The proposed FFT-like overlapped block orthogonal transform allows us to approximate the auditory critical band decomposition in an efficient manner, which is a major advantage over previous approaches that used uniform filter banks. As a result of such a decomposition, the perceptually tuned time-frequency structure of the original speech signal is preserved. This allows us to make use of the temporal and spectral properties of the human auditory system to decrease the average bit rate of the encoder, while perceptually hiding the quantization error.
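
The idea of a non-uniform, critical-band-like decomposition can be illustrated with a minimal sketch. It is not the paper's transform: the orthonormal Haar filter pair is swapped in for the FFT-like overlapped block orthogonal transform purely for brevity, and the tree shape `TREE`, the test signal, and the function names are all hypothetical choices made for this example. The sketch only shows that splitting the low-frequency branch more deeply yields narrower bands at low frequencies while the transform stays orthogonal (energy is preserved).

```python
import math

def haar_split(x):
    """One orthonormal Haar analysis step: returns (lowpass, highpass) halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return low, high

def decompose(x, tree, depth=0):
    """Wavelet packet decomposition along a nested-tuple tree.

    tree is None for a leaf, or (low_subtree, high_subtree) for a split.
    Returns a list of (depth, coefficients) leaves in tree order, which is
    not strict frequency order beneath highpass branches.
    """
    if tree is None:
        return [(depth, x)]
    low, high = haar_split(x)
    return (decompose(low, tree[0], depth + 1)
            + decompose(high, tree[1], depth + 1))

# Hypothetical tree: the lowpass side is split three times, the highpass
# side twice, so the lowest bands are the narrowest.
TREE = (((None, None), None), (None, None))

# Arbitrary 16-sample test signal (assumption, for illustration only).
signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.5 * n) for n in range(16)]
bands = decompose(signal, TREE)

for depth, coeffs in bands:
    # A leaf at depth d nominally covers 1 / 2**d of the band up to Nyquist.
    print(f"depth {depth}: {len(coeffs)} coefficients, "
          f"relative bandwidth 1/{2 ** depth}")
```

A leaf at depth d covers a nominal fraction 1/2^d of the spectrum up to the Nyquist frequency, so a deeper tree on the lowpass side gives finer resolution where auditory critical bands are narrow. A real coder would use longer filters and a tree matched to a critical-band (Bark) scale.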

Similar resources

Integrated speech enhancement and coding in the time-frequency domain

This paper addresses the problem of merging speech enhancement and coding in the context of auditory modeling. The noisy signal is first processed by a fast wavelet packet transform algorithm to obtain an auditory spectrum, from which a rough masking model is estimated. Then, this model is used to refine a subtractive-type enhancement algorithm. The enhanced speech coefficients are then encoded i...

A warped linear-prediction-based subband audio coding algorithm

In this paper, a novel audio coding algorithm is proposed where the warped linear prediction (WLP) technique is employed to construct a perceptual pre- and post-filter for subband audio coding. A modified signal-to-mask ratio (SMR) calculation is given for subband coding of the WLP residuals of audio signals. The concept of perceptual entropy (PE) is extended to subband coding, resulting in the s...

Time-frequency masking for speech separation and its potential for hearing aid design.

A new approach to the separation of speech from speech-in-noise mixtures is the use of time-frequency (T-F) masking. Originating in the field of computational auditory scene analysis, T-F masking performs separation in the time-frequency domain. This article introduces the T-F masking concept and reviews T-F masking algorithms that separate target speech from either monaural or binaural mixtures...

Comparison of auditory masking models for speech coding

In this paper, various auditory masking models recently developed for audio coding are compared and evaluated for telephone-bandwidth speech coding applications. Four such models are outlined and their performance is evaluated using a wavelet packet transform based subband coder. The models are compared on the basis of the resulting perceptual speech quality and bit rate requirements. Results show ...

Perceptual irrelevancy removal in narrowband speech coding

A masking model originally designed for audio signals is applied to narrowband speech. The model is used to detect and remove the perceptually irrelevant simultaneously masked frequency components of a speech signal. Objective measurements have shown that the modified speech signal can be coded more efficiently than the original signal. Furthermore, it has been confirmed through perceptual eval...

Publication date: 1997
