Perceptual speech coding using time and frequency masking constraints
Abstract
This paper presents a new wide-band speech coding system based on a fast wavelet packet transform algorithm as well as a formulation of temporal and spectral psychoacoustic models of masking. The proposed FFT-like overlapped block orthogonal transform allows us to approximate the auditory critical band decomposition in an efficient manner, which is a major advantage over previous approaches that used uniform filter banks. As a result of such a decomposition, the perceptually tuned time-frequency structure of the original speech signal is preserved. This allows us to make use of the temporal and spectral properties of the human auditory system to decrease the average bit rate of the encoder, while perceptually hiding the quantization error.
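As a rough illustration of why a non-uniform wavelet packet tree can track critical bands more closely than a uniform filter bank, the sketch below compares the Bark-scale width of the bands produced by each. It is not the paper's transform: the 16 kHz sampling rate, the leaf depths in `TREE_DEPTHS`, and the helper names are all assumptions chosen for illustration, the leaves are listed directly in frequency order, and the Hz-to-Bark mapping is the common Zwicker and Terhardt approximation.

```python
import math

NYQUIST_HZ = 8000.0  # assumption: wide-band speech sampled at 16 kHz

# Hypothetical tree: leaf depths listed in increasing frequency order.
# A leaf at depth d covers NYQUIST_HZ / 2**d Hz, so deeper leaves are narrower.
TREE_DEPTHS = [6] * 8 + [5] * 4 + [4] * 4 + [3] * 4  # 20 bands in total


def band_edges(depths, nyquist):
    """Return (low, high) edges in Hz for leaves given in frequency order."""
    edges, low = [], 0.0
    for depth in depths:
        width = nyquist / 2 ** depth
        edges.append((low, low + width))
        low += width
    return edges


def hz_to_bark(freq_hz):
    """Zwicker-Terhardt approximation of the Bark (critical band) scale."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


def bark_widths(bands):
    """Width of each band measured in Bark rather than Hz."""
    return [hz_to_bark(high) - hz_to_bark(low) for low, high in bands]


tree_bands = band_edges(TREE_DEPTHS, NYQUIST_HZ)
# Uniform bank with the same number of bands: every leaf at the same width.
uniform_width = NYQUIST_HZ / len(TREE_DEPTHS)
uniform_bands = [(i * uniform_width, (i + 1) * uniform_width)
                 for i in range(len(TREE_DEPTHS))]

print("widest tree band (Bark):   ", round(max(bark_widths(tree_bands)), 2))
print("widest uniform band (Bark):", round(max(bark_widths(uniform_bands)), 2))
```

With the same number of bands, the uniform bank's lowest band spans several critical bands, while the tree keeps every band much closer to one Bark. A real wavelet packet implementation would also have to account for the filter responses and the ordering of the tree nodes, which this sketch ignores.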
Similar resources
Integrated speech enhancement and coding in the time-frequency domain
This paper addresses the problem of merging speech enhancement and coding in the context of auditory modeling. The noisy signal is first processed by a fast wavelet packet transform algorithm to obtain an auditory spectrum, from which a rough masking model is estimated. Then, this model is used to refine a subtractive-type enhancement algorithm. The enhanced speech coefficients are then encoded i...
A warped linear-prediction-based subband audio coding algorithm
In this paper, a novel audio coding algorithm is proposed where the warped linear prediction (WLP) technique is employed to construct a perceptual pre- and post-filter for subband audio coding. A modified signal-to-mask ratio (SMR) calculation is given for subband coding of the WLP residuals of audio signals. The concept of perceptual entropy (PE) is extended to subband coding, resulting in the s...
Time-frequency masking for speech separation and its potential for hearing aid design.
A new approach to the separation of speech from speech-in-noise mixtures is the use of time-frequency (T-F) masking. Originating in the field of computational auditory scene analysis, T-F masking performs separation in the time-frequency domain. This article introduces the T-F masking concept and reviews T-F masking algorithms that separate target speech from either monaural or binaural mixtures...
Comparison of auditory masking models for speech coding
In this paper various auditory masking models recently developed for audio coding are compared and evaluated for telephone bandwidth speech coding applications. Four such models are outlined and their performance evaluated using a Wavelet Packet Transform based subband coder. The models are compared on the basis of the resulting perceptual speech quality and bit rate requirements. Results show ...
Perceptual irrelevancy removal in narrowband speech coding
A masking model originally designed for audio signals is applied to narrowband speech. The model is used to detect and remove the perceptually irrelevant simultaneously masked frequency components of a speech signal. Objective measurements have shown that the modified speech signal can be coded more efficiently than the original signal. Furthermore, it has been confirmed through perceptual eval...