Environmental Sound Recognition With Time-Frequency Audio Features
نویسندگان
چکیده
The paper considers the task of recognizing environmental sounds for the understanding of a scene or context surrounding an audio sensor. A variety of features have been proposed for audio recognition, including the popular Mel-frequency cepstral coefficients (MFCCs) which describe the audio spectral shape. Environmental sounds, such as chirpings of insects and sounds of rain which are typically noise-like with a broad flat spectrum, may include strong temporal domain signatures. However, only few temporal-domain features have been developed to characterize such diverse audio signals previously. Here, we perform an empirical feature analysis for audio environment characterization and propose to use the matching pursuit (MP) algorithm to obtain effective time–frequency features. The MP-based method utilizes a dictionary of atoms for feature selection, resulting in a flexible, intuitive and physically interpretable set of features. The MP-based feature is adopted to supplement the MFCC features to yield higher recognition accuracy for environmental sounds. Extensive experiments are conducted to demonstrate the effectiveness of these joint features for unstructured environmental sound classification, including listening tests to study human recognition capabilities. Our recognition system has shown to produce comparable performance as human listeners.
منابع مشابه
Using Spectro-Temporal Features for Environmental Sounds Recognition
The paper presents the task of recognizing environmental sounds for audio surveillance and security applications. A various characteristics have been proposed for audio classification, including the popular Mel-frequency cepstral coefficients (MFCCs) which give a description of the audio spectral shape. However, it exist some temporal-domain features. These last have been developed to character...
متن کاملComparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks
Recent successful applications of convolutional neural networks (CNNs) to audio classification and speech recognition have motivated the search for better input representations for more efficient training. Visual displays of an audio signal, through various time-frequency representations such as spectrograms offer a rich representation of the temporal and spectral structure of the original sign...
متن کاملA Robust Environmental Sound Recognition System using Frequency Domain Features
In ubiquitous environments, analysis and classification of sound plays a critical role in various acoustic-based recognition systems. This work aims to contribute towards building an automatic sound recognition system that can understand the surrounding environment by the audio information. In this paper, an acoustic signal based context awareness system is proposed for detecting sound events i...
متن کاملEnvironment recognition for digital audio forensics using MPEG-7 and mel cepstral features
Environment recognition from digital audio for forensics application is a growing area of interest. However, compared to other branches of audio forensics, it is a less researched one. Especially less attention has been given to detect environment from files where foreground speech is present, which is a forensics scenario. In this paper, we perform several experiments focusing on the problems ...
متن کاملAnalysis of spectrogram image methods for sound event classification
The time-frequency spectrogram representation of an audio signal can be visually analysed by a trained researcher to recognise any underlying sound events in a process called “spectrogram reading”. However, this has not become a popular approach for automatic classification, as the field is driven by Automatic Speech Recognition (ASR) where frame-based features are popular. As opposed to speech...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- IEEE Trans. Audio, Speech & Language Processing
دوره 17 شماره
صفحات -
تاریخ انتشار 2009