Binaural Scene Analysis and Automatic Speech Recognition

نویسندگان

  • Constantin Spille
  • Mathias Dietz
  • Volker Hohmann
  • Bernd T. Meyer
چکیده

The human auditory system is known to be able to easily analyze and decompose complex acoustic scenes into its constituent acoustic sources. This requires the integration of a multitude of acoustic cues, a phenomenon that is often referred to as cocktail-party processing. Auditory Scene Analysis, especially the segregation and comprehension of concurrent speakers, is one of the key features in cocktail-party processing [1].

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

مدل میکروسکوپی دوگوشی مبتنی بر فیلتر بانک مدولاسیون برای پیش گویی قابلیت فهم گفتار در افراد دارای شنوایی عادی

In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...

متن کامل

On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis

What is the computational goal of auditory scene analysis? This is a key issue to address in the Marrian information-processing framework. It is also an important question for researchers in computational auditory scene analysis (CASA) because it bears directly on how a CASA system should be evaluated. In this chapter I discuss different objectives used in CASA. I suggest as a main CASA goal th...

متن کامل

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract   Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

متن کامل

Binaural Speech Separation Using Recurrent Timing Neural Networks for Joint F0-Localisation Estimation

A speech separation system is described in which sources are represented in a joint interaural time difference-fundamental frequency (ITD-F0) cue space. Traditionally, recurrent timing neural networks (RTNNs) have been used only to extract periodicity information; in this study, this type of network is extended in two ways. Firstly, a coincidence detector layer is introduced, each node of which...

متن کامل

Binaural Signal Processing for Enhanced Speech Recognition Robustness in Complex Listening Environments

This paper addresses the problem of automatic speech recognition (ASR) in the presence of room reverberation, speaker movements and highly non-stationary background noise on the basis of binaural microphone recordings. Investigations are conducted for Track 1 of the 2nd CHiME Speech Separation and Recognition Challenge, posing a small-vocabulary task that requires the recognition of a short key...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013