Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

Authors

Christian Feldbauer

Gernot Kubin

W. Bastiaan Kleijn

Abstract

Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.
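The encode/decode chain described in the abstract (map the signal into an auditory representation, quantize it there, then invert the mapping at the decoder) can be illustrated with a minimal sketch. This is not the authors' auditory model: a sample-wise mu-law style compression is swapped in as a toy invertible mapping, and the constant `MU`, the step size, and all function names are assumptions made for illustration only. The point it shows is that a plain uniform quantizer in the transformed domain can stand in for a more complicated distortion criterion in the signal domain.

```python
import math

MU = 255.0  # assumed compression constant (mu-law style); not taken from the paper


def forward_model(x):
    """Toy invertible 'auditory' mapping: sample-wise compression of x in [-1, 1]."""
    return [math.copysign(math.log1p(MU * abs(v)) / math.log1p(MU), v) for v in x]


def inverse_model(y):
    """Exact inverse of forward_model (expansion back to the signal domain)."""
    return [math.copysign(math.expm1(abs(v) * math.log1p(MU)) / MU, v) for v in y]


def quantize(y, step):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in y]


def dequantize(indices, step):
    """Map integer indices back to reconstruction levels."""
    return [i * step for i in indices]


def encode(x, step=1 / 64):
    """Encoder: transform to the toy representation, then quantize there."""
    return quantize(forward_model(x), step)


def decode(indices, step=1 / 64):
    """Decoder: dequantize, then invert the toy model."""
    return inverse_model(dequantize(indices, step))


# Hypothetical test signal: five cycles of a sine wave over 64 samples.
signal = [0.5 * math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
indices = encode(signal)
reconstructed = decode(indices)
max_error = max(abs(a - b) for a, b in zip(signal, reconstructed))
```

In this sketch the quantizer only ever sees the transformed values, so all perceptual shaping (here, simply finer resolution at low amplitudes) comes from the invertible mapping rather than from the quantizer itself.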


Similar resources

A Gammatone-based Psychoacoustical Modeling Approach for Speech and Audio Coding

We propose a new approach for modeling auditory masking based on gammatone filters for application areas including speech/audio coding and audio watermarking. Besides the use of gammatone filters, this model differs from existing audio coding psychoacoustical models (e.g., the ones used in MPEG), in taking into account the contribution of a range of filters in computing the distortion, rather t...


Articulatory synthesis from x-rays and inversion for an adaptive speech robot

This paper describes a speech robotic approach to articulatory synthesis. An anthropomorphic speech robot has been built, based on a real reference subject’s data. This speech robot, called the Articulotron, has a set of relevant degrees of freedom for speech articulators, jaw, tongue, lips, and larynx. The associated articulatory model has been elaborated from cineradiographic midsagittal prof...


Gaussian Mixture Model Based Coding of Speech and Audio

The transmission of speech and audio over communication channels has always required speech and audio coders with reasonable search and computational complexity and good performance relative to the corresponding distortion measure. This work introduces a coding scheme which works in a perceptual auditory domain. The input high dimensional frames of audio and speech are transformed to power spec...


Anthropomorphic Agent as an Integrating Platform of Audio-Visual Information

One of the ultimate human-machine interfaces is an anthropomorphic spoken dialog agent that behaves like a human, with facial animation and gesture, and holds spoken conversations with humans. Among the numerous efforts devoted to this goal, the Galatea Project, conducted by 17 members from 12 universities, is developing an open-source, license-free software toolkit [1] for building an anthropomorphic spoken dia...


Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently, permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack assumes a very resourceful adversary, which may not always be the case. To show the insecurity of these ciphers, we present a ciphertext-only attack on speech permutation ciphers. We show that the inherent redundancies of speech can pave the way for a successful ciphertext-only attack. To that end, regularities ...


Journal:

EURASIP J. Adv. Sig. Proc.

Volume: 2005 (issue not listed)

Pages: not listed

Publication date: 2005
