Hierarchical spike coding of sound

Authors

  • Yan Karklin
  • Chaitanya Ekanadham
  • Eero P. Simoncelli
Abstract

Natural sounds exhibit complex statistical regularities at multiple scales. Acoustic events underlying speech, for example, are characterized by precise temporal and frequency relationships, but they can also vary substantially according to the pitch, duration, and other high-level properties of speech production. Learning this structure from data while capturing the inherent variability is an important first step in building auditory processing systems, as well as understanding the mechanisms of auditory perception. Here we develop Hierarchical Spike Coding, a two-layer probabilistic generative model for complex acoustic structure. The first layer consists of a sparse spiking representation that encodes the sound using kernels positioned precisely in time and frequency. Patterns in the positions of first-layer spikes are learned from the data: on a coarse scale, statistical regularities are encoded by a second-layer spiking representation, while fine-scale structure is captured by recurrent interactions within the first layer. When fit to speech data, the second-layer acoustic features include harmonic stacks, sweeps, frequency modulations, and precise temporal onsets, which can be composed to represent complex acoustic events. Unlike spectrogram-based methods, the model gives a probability distribution over sound pressure waveforms. This allows us to use the second-layer representation to synthesize sounds directly, and to perform model-based denoising, on which we demonstrate a significant improvement over standard methods.
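To make the idea of a first-layer spike code concrete, here is a minimal sketch of encoding a waveform as a sparse set of (time, kernel, amplitude) spikes. It is not the authors' inference procedure: the abstract does not say how spikes are inferred, so greedy matching pursuit is swapped in as a stand-in, and the gammatone-shaped kernels, the function names `gammatone` and `spike_code`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def gammatone(freq_hz, sr, duration=0.02, order=4, bandwidth=150.0):
    """Unit-norm gammatone-shaped kernel (illustrative parameters, not from the paper)."""
    t = np.arange(int(duration * sr)) / sr
    k = t ** (order - 1) * np.exp(-2 * np.pi * bandwidth * t) * np.cos(2 * np.pi * freq_hz * t)
    return k / np.linalg.norm(k)

def spike_code(signal, kernels, n_spikes):
    """Greedy matching pursuit: returns a list of (time_index, kernel_index, amplitude)
    spikes and the residual left after subtracting them from the signal."""
    residual = np.asarray(signal, dtype=float).copy()
    spikes = []
    for _ in range(n_spikes):
        best = None
        for idx, k in enumerate(kernels):
            # corr[t] is the inner product of the residual with kernel k placed at offset t
            corr = np.correlate(residual, k, mode="valid")
            t = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[t]) > abs(best[2]):
                best = (idx, t, corr[t])
        idx, t, amp = best
        # Kernels are unit-norm, so amp is the projection coefficient
        residual[t:t + len(kernels[idx])] -= amp * kernels[idx]
        spikes.append((t, idx, float(amp)))
    return spikes, residual

# Toy usage: a signal containing a single kernel at a known position
sr = 16000
kernels = [gammatone(f, sr) for f in (500.0, 1500.0, 3000.0)]
signal = np.zeros(1000)
signal[100:100 + len(kernels[1])] += 2.0 * kernels[1]
spikes, residual = spike_code(signal, kernels, n_spikes=1)
print(spikes)
```

In this toy case a single spike recovers the time, kernel index, and amplitude of the embedded kernel. The model described in the abstract goes further by learning patterns in the spike positions, which this sketch does not attempt.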

Similar resources

Proportional spike-timing precision and firing reliability underlie efficient temporal processing of periodicity and envelope shape cues.

Temporal sound cues are essential for sound recognition, pitch, rhythm, and timbre perception, yet how auditory neurons encode such cues is a subject of ongoing debate. Rate coding theories propose that temporal sound features are represented by rate tuned modulation filters. However, overwhelming evidence also suggests that precise spike timing is an essential attribute of the neural code. Here ...

Coding of sound-source location by ensembles of cortical neurons.

We examined the coding of sound-source location by ensembles of neurons in the auditory cortex. Broadband noise bursts were presented from loudspeakers throughout 360 degrees in the horizontal plane. Sound levels varied from 20 to 40 dB above neural thresholds. We recorded temporal spike patterns simultaneously at 16 recording sites in area A2 of alpha-chloralose-anesthetized cats. Spike patter...

A Spike-Based Model of Binaural Sound Localization

This paper describes a spike-based model of binaural sound localization using interaural time differences (ITDs). To handle the problem of temporal coding and to facilitate a hardware implementation all neurons are simulated by a spike response model, which includes postsynaptic potentials (PSPs) and a refractory period. A winner-take-all (WTA) network selects the dominant source from the repre...

Temporal pattern recognition based on instantaneous spike rate coding in a simple auditory system.

Auditory pattern recognition by the CNS is a fundamental process in acoustic communication. Because crickets communicate with stereotyped patterns of constant frequency syllables, they are established models to investigate the neuronal mechanisms of auditory pattern recognition. Here we provide evidence that for the neural processing of amplitude-modulated sounds, the instantaneous spike rate r...

Temporal coding by populations of auditory receptor neurons

Auditory receptor neurons of crickets are most sensitive to either low or high sound frequencies. Earlier work showed that the temporal coding properties of first-order auditory interneurons are matched to the temporal characteristics of natural low-frequency and high-frequency stimuli (cricket songs and bat echolocation calls, respectively). We study the temporal coding propertie...

Journal:
  • Advances in neural information processing systems

Volume 2012

Publication date: 2012