Auditory Sketches: Very Sparse Representations of Sounds Are Still Recognizable
Abstract
Sounds in our environment, such as voices, animal calls, or musical instruments, are easily recognized by human listeners. Understanding the key features underlying this robust sound recognition is an important question in auditory science. Here, we studied the recognition by human listeners of new classes of sounds: acoustic and auditory sketches, sounds that are severely impoverished but still recognizable. Starting from a time-frequency representation, a sketch is obtained by keeping only sparse elements of the original signal, here by means of a simple peak-picking algorithm. Two time-frequency representations were compared: a biologically grounded one, the auditory spectrogram, which simulates peripheral auditory filtering, and a simple acoustic spectrogram, based on a Fourier transform. Three degrees of sparsity were also investigated. Listeners were asked to recognize the category to which a sketch sound belongs: singing voices, bird calls, musical instruments, or vehicle engine noises. Results showed that, with the exception of voice sounds, very sparse representations of sounds (10 features, or energy peaks, per second) could be recognized above chance. No clear differences could be observed between the acoustic and the auditory sketches. For the voice sounds, however, a completely different pattern of results emerged, with at-chance or even below-chance recognition performance, suggesting that the important features of the voice, whatever they are, were removed by the sketch process. Overall, these perceptual results were well correlated with a model of auditory distances based on spectro-temporal excitation patterns (STEPs). This study confirms the potential of these new classes of sounds, acoustic and auditory sketches, for studying sound recognition.
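The peak-picking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the algorithm's details, so the window length, hop size, 4-neighbour peak definition, test signal, and the helper names `stft_magnitude` and `sketch_mask` are all assumptions made for illustration. It uses a plain Fourier-based (acoustic) spectrogram and keeps the 10 strongest energy peaks per second; the auditory spectrogram and the resynthesis of a sound from the sparse representation are not covered.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency bins x frames) with a Hann window.
    frame_len and hop are illustrative choices, not values from the study."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def sketch_mask(spec, n_peaks):
    """Boolean mask marking the n_peaks largest local maxima of spec.
    A local maximum is a cell strictly greater than its 4 neighbours
    (an assumed definition; the abstract only says 'simple peak-picking')."""
    padded = np.pad(spec, 1, mode="constant", constant_values=-np.inf)
    center = padded[1:-1, 1:-1]
    is_peak = (
        (center > padded[:-2, 1:-1]) & (center > padded[2:, 1:-1])
        & (center > padded[1:-1, :-2]) & (center > padded[1:-1, 2:])
    )
    peak_idx = np.flatnonzero(is_peak)
    # Sort the candidate peaks by energy and keep the strongest ones.
    order = np.argsort(spec.ravel()[peak_idx])[::-1][:n_peaks]
    mask = np.zeros(spec.size, dtype=bool)
    mask[peak_idx[order]] = True
    return mask.reshape(spec.shape)

# Hypothetical test signal: two tones plus a little noise, 2 seconds at 8 kHz.
rng = np.random.default_rng(0)
sr, duration = 8000, 2.0
t = np.arange(int(sr * duration)) / sr
signal = (np.sin(2 * np.pi * 440 * t)
          + 0.5 * np.sin(2 * np.pi * 1200 * t)
          + 0.1 * rng.standard_normal(t.size))

spec = stft_magnitude(signal)
peaks_per_second = 10  # the sparsest condition mentioned in the abstract
mask = sketch_mask(spec, n_peaks=int(peaks_per_second * duration))
sparse_spec = np.where(mask, spec, 0.0)  # everything but the peaks is zeroed
```

Here the peak budget is applied over the whole signal; a variant could pick peaks within each one-second segment so that the density is uniform over time.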
Related resources
Learning of sparse auditory receptive fields
It is largely unknown how the properties of the auditory system relate to the properties of natural sounds. Here, we analyze representations of simulated neurons that have optimally sparse activity in response to spectrotemporal speech data. These representations share important properties with auditory neurons as determined in electrophysiological experiments.
Sparse Representation of Sounds in the Unanesthetized Auditory Cortex
How do neuronal populations in the auditory cortex represent acoustic stimuli? Although sound-evoked neural responses in the anesthetized auditory cortex are mainly transient, recent experiments in the unanesthetized preparation have emphasized subpopulations with other response properties. To quantify the relative contributions of these different subpopulations in the awake preparation, we hav...
Sound Retrieval and Ranking Using Sparse Auditory Representations
To create systems that understand the sounds that humans are exposed to in everyday life, we need to represent sounds with features that can discriminate among many different sound classes. Here, we use a sound-ranking framework to quantitatively evaluate such representations in a large-scale task. We have adapted a machine-vision method, the passive-aggressive model for image retrieval (PAMIR)...
Vocal Imitations of Non-Vocal Sounds
Imitative behaviors are widespread in humans, in particular whenever two persons communicate and interact. Several tokens of spoken languages (onomatopoeias, ideophones, and phonesthemes) also display different degrees of iconicity between the sound of a word and what it refers to. Thus, it probably comes as no surprise that human speakers use a lot of imitative vocalizations and gestures when ...
Physics-based and Spike-guided Tools for Sound Design
In this paper we present graphical tools and parameter search algorithms for timbre space exploration and the design of complex sounds generated by physical modeling synthesis. The tools are built around a sparse auditory representation of sounds based on Gammatone functions and provide the designer with both a graphical and an auditory insight. The auditory representation of a number of refer...