Hierarchical Spectro-Temporal Models for Speech Recognition

Authors

  • Jake Bouvrie
  • Tony Ezzat
  • Tomaso Poggio
Abstract

We explore computational approaches to audition that are inspired by computational visual neuroscience. In particular, we seek to leverage recent progress in building a biologically faithful, hierarchical, feed-forward system for visual object recognition [13,14]. The system, which was designed to closely match the currently known feed-forward path of the ventral stream in visual cortex, processes 2-D images hierarchically to determine the category and identity of a particular object within the image. It can recognize the object irrespective of variations in position, scale, and orientation, and in the presence of clutter.

Similar resources

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...

Generalization performance of spectro-temporal speech features

Introduction: Despite the fact that the dynamic aspects of speech are very important, conventional speech features such as Mel Cepstral Coefficients (MFCCs) [1] and RelAtive SpecTrAl Perceptual Linear Predictive (RASTA-PLP) features [2] capture only stationary spectral information. We could previously show that a combination of conventional speech features with spectro-temporal speech features yield...

Investigating the Complementarity of Spectral and Spectro-temporal Features

Most common speech features, such as Mel Cepstral Coefficients (MFCCs) and RelAtive SpecTrAl Perceptual Linear Predictive (RASTA-PLP) features, use only spectral information. However, measurements in the mammalian auditory cortex show that the mammalian brain jointly uses spectral and temporal information. To model this we previously developed Hierarchical SpectroTemporal (HIST) features [1...

Multi-stream spectro-temporal features for robust speech recognition

A multi-stream approach to utilizing the inherently large number of spectro-temporal features for speech recognition is investigated in this study. Instead of reducing the feature-space dimension, this method divides the features into streams so that each represents a patch of information in the spectro-temporal response field. When used in combination with MFCCs for speech recognition under both...

Publication date: 2007