Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations
Related resources
Speech-nonspeech discrimination based on speech-relevant spectrogram modulations
In this work, we adopt an information-theoretic approach, the Information Bottleneck method, to extract the relevant modulation frequencies across both dimensions of a spectrogram for speech/non-speech discrimination, where the non-speech sounds are music, animal vocalizations, and environmental noises. A compact representation is built for each sound ensemble, consisting of the maximally informative features. We demonstrate th...
Spectro-temporal Modulations for Robust Speech Emotion Recognition
Speech-nonspeech discrimination using the information bottleneck method and spectro-temporal modulation index
In this work, we adopt an information-theoretic approach, the Information Bottleneck method, to extract the relevant spectro-temporal modulations for the task of speech/non-speech discrimination; non-speech events include music, noise, and animal vocalizations. A compact representation (a “cluster prototype”) is built for each class, consisting of the maximally informative features with respect to ...
Spectro-temporal modulations for robust speech emotion recognition
Speech emotion recognition is mostly studied on clean speech. In this paper, joint spectro-temporal features (RS features) are extracted from an auditory model and applied to detect the emotional state of noisy speech. The noisy speech is derived from the Berlin Emotional Speech database, with white and babble noise added at various SNR levels. The clean train/noisy test scenario is in...
Methods for capturing spectro-temporal modulations in automatic speech recognition
Psychoacoustical and neurophysiological results indicate that spectro-temporal modulations play an important role in sound perception. Speech signals, in particular, exhibit distinct spectro-temporal patterns that are well matched by the receptive fields of cortical neurons. In order to improve the performance of automatic speech recognition (ASR) systems, a number of different approaches are prese...
Journal
Journal title: IEEE Transactions on Audio, Speech and Language Processing
Year: 2006
ISSN: 1558-7916
DOI: 10.1109/tsa.2005.858055