Mid-level representations for Computational Auditory Scene Analysis
Abstract
In this paper we consider representations for use in models of the processing that occurs between the eardrum and our conscious experience of sound. We first list “good” properties for such mid-level representations, then present a framework within which to discuss some examples. We compare in detail two popular schemes — sinusoid tracks and correlograms — and propose a new representation, wefts, which seeks to combine their advantages.
Similar resources
Separation of speech from interfering sounds based on oscillatory correlation
A multistage neural model is proposed for an auditory scene analysis task--segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized relaxation oscillators, each of which corre...
Underconstrained Stochastic Representations for Top-down Computational Auditory Scene Analysis
Since Bregman published his unifying account of psychological results in auditory organization, Auditory Scene Analysis [1], there has been a series of computational models of these principles. The dominant approach, as embodied in the dissertations of Cooke [2], Mellinger [3] and Brown [4], and elsewhere [5], may be characterized as follows: First the sound is processed by a conventional signalpr...
Bregman's Chimerae: Music Perception as Auditory Scene Analysis
Research into the perception and cognition of music listening often contains implicit assumptions about the nature of the underlying mental representations, and about the relationship between "auditory processing" and "music perception". We attempt to highlight and problematize some of these assumptions and to provide a more cognitively appropriate model for music perception and cognition, base...
Auditory Scene Analysis: Computational Models
Listeners have to make sense of a complex acoustic world containing overlapping sound sources that must be organized into individual auditory objects. Computational auditory scene analysis concerns the use of algorithms inspired by human sound perception whose aim is to extract properties of constituent sound sources in a complex mixture. Starting with representations based on models of how soun...
16 Separation of Speech by Computational Auditory Scene Analysis
The term auditory scene analysis (ASA) refers to the ability of human listeners to form perceptual representations of the constituent sources in an acoustic mixture, as in the well-known ‘cocktail party’ effect. Accordingly, computational auditory scene analysis (CASA) is the field of study which attempts to replicate ASA in machines. Some CASA systems are closely modelled on the known stages o...