Binaural cue coding-Part I: psychoacoustic fundamentals and design principles

Authors

  • Frank Baumgarte
  • Christof Faller
Abstract

Binaural Cue Coding (BCC) is a method for multichannel spatial rendering based on one down-mixed audio channel and BCC side information. The BCC side information has a low data rate and it is derived from the multichannel encoder input signal. A natural application of BCC is multichannel audio data rate reduction since only a single down-mixed audio channel needs to be transmitted. An alternative BCC scheme for efficient joint transmission of independent source signals supports flexible spatial rendering at the decoder. This paper (Part I) discusses the most relevant binaural perception phenomena exploited by BCC. Based on that, it presents a psychoacoustically motivated approach for designing a BCC analyzer and synthesizer. This leads to a reference implementation for analysis and synthesis of stereophonic audio signals based on a Cochlear Filter Bank. BCC synthesizer implementations based on the FFT are presented as low-complexity alternatives. A subjective audio quality assessment of these implementations shows the robust performance of BCC for critical speech and audio material. Moreover, the results suggest that the performance given by the reference synthesizer is not significantly compromised when using a low-complexity FFT-based synthesizer. The companion paper (Part II) generalizes BCC analysis and synthesis for multichannel audio and proposes complete BCC schemes including quantization and coding. Part II also describes an alternative BCC scheme with flexible rendering capability at the decoder and proposes several applications for both BCC schemes.
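
The analysis step described in the abstract (one down-mixed channel plus low-rate side information) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the function names `dft` and `bcc_analyze`, the averaging down-mix, the use of a per-band inter-channel level difference in dB as the only cue, the sign convention (right relative to left), and the uniform band edges. The paper's reference design uses a Cochlear Filter Bank; this sketch uses a plain DFT of a single frame instead.

```python
import cmath
import math


def dft(frame):
    """Naive one-sided DFT of a real frame (bins 0 .. n//2)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def bcc_analyze(left, right, band_edges):
    """Hypothetical BCC-style analysis of one stereo frame.

    Returns the mono down-mix and, per band, the level difference in dB
    of the right channel relative to the left (an assumed convention).
    band_edges lists DFT bin indices; band i covers
    bins band_edges[i] .. band_edges[i + 1] - 1.
    """
    # Down-mix: simple average of the two channels (an assumption).
    downmix = [0.5 * (l + r) for l, r in zip(left, right)]

    spec_l, spec_r = dft(left), dft(right)
    eps = 1e-12  # keeps the ratio finite in silent bands
    level_diff_db = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        power_l = sum(abs(spec_l[k]) ** 2 for k in range(lo, hi))
        power_r = sum(abs(spec_r[k]) ** 2 for k in range(lo, hi))
        level_diff_db.append(10 * math.log10((power_r + eps) / (power_l + eps)))
    return downmix, level_diff_db


if __name__ == "__main__":
    n = 32
    left = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    right = [0.5 * x for x in left]  # same tone at half the amplitude
    mono, cues = bcc_analyze(left, right, [0, 8, 17])
    print(len(mono), [round(c, 2) for c in cues])
```

In a scheme of this kind, only the down-mix and the short list of per-band values would be transmitted. A synthesizer would then rescale the bands of the down-mix to restore the level differences.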

Similar resources

Binaural cue coding-Part II: Schemes and applications

Binaural Cue Coding (BCC) is a method for multichannel spatial rendering based on one down-mixed audio channel and side information. The companion paper (Part I) covers the psychoacoustic fundamentals of this method and outlines principles for the design of BCC schemes. The BCC analysis and synthesis methods of Part I are motivated and presented in the framework of stereophonic audio coding. Th...

Judging time-to-passage of looming sounds: Evidence for the use of distance-based information

Perceptual judgments are an essential mechanism for our everyday interaction with other moving agents or events. For instance, estimation of the time remaining before an object contacts or passes us is essential to act upon or to avoid that object. Previous studies have demonstrated that participants use different cues to estimate the time to contact or the time to passage of approaching visual...

Lateralization of binaural stimuli with independent fine-structure and envelope-based temporal disparities

A computational model for the lateralization of binaural stimuli, motivated by recent physiological findings in the literature and psychoacoustic data is presented. The model is based on the evaluation of the interaural phase difference (IPD). In the model, IPDs are separately assessed for the stimulus’ fine-structure and envelope. Psychoacoustic measurements were conducted and compared to mode...

Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication

Speech enhancement has been researched extensively for many years to provide high-quality speech communication in the presence of background noise and concurrent interference signals. Human listening is robust against these acoustic interferences using only two ears, but state-of-the-art two-channel algorithms function poorly. Motivated by psychoacoustic studies of binaural hearing (equalizatio...

A common periodic representation of interaural time differences in mammalian cortex

Binaural hearing, the ability to detect small differences in the timing and level of sounds at the two ears, underpins the ability to localize sound sources along the horizontal plane, and is important for decoding complex spatial listening environments into separate objects - a critical factor in 'cocktail-party listening'. For human listeners, the most important spatial cue is the interaural ...

Journal:
  • IEEE Trans. Speech and Audio Processing

Volume 11

Publication date: 2003