MUSAN: A Music, Speech, and Noise Corpus
Abstract
This report introduces a new corpus of music, speech, and noise. The dataset is suitable for training models for voice activity detection (VAD) and music/speech discrimination. The corpus is released under a flexible Creative Commons license. It consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises. We demonstrate the use of this corpus for music/speech discrimination on broadcast news and for VAD in speaker identification.
Similar resources
Non-negative matrix factorization based compensation of music for automatic speech recognition
This paper proposes using speech enhancement based on non-negative matrix factorization for robust automatic recognition of mixtures of speech and music. We represent magnitude spectra of noisy speech signals as the non-negative weighted linear combination of speech and noise spectral basis vectors, which are obtained from training corpora of speech and music. We use overcomplete dictionaries consi...
Towards Automatic Intoxication Detection from Speech in Real-Life Acoustic Environments
In-car intoxication detection from speech is a highly promising non-intrusive method to reduce the accident risk associated with drunk driving. However, in-car noise significantly influences the recognition performance and needs to be addressed in practical applications. In this paper, we investigate how seriously the intrinsic in-car noise and background music affect the accuracy of intoxicati...
Adaptive V/UV Speech Detection Based on Characterization of Background Noise
The paper presents an adaptive system for Voiced/Unvoiced (V/UV) speech detection in the presence of background noise. Genetic algorithms were used to select the features that offer the best V/UV detection according to the output of a background Noise Classifier (NC) and a Signal-to-Noise Ratio Estimation (SNRE) system. The system was implemented, and the tests performed using the TIMIT speech ...
Audio Features for Noisy Sound Segmentation
Automatic audio classification usually treats sounds as music, speech, silence, or noise, but work on the noise class is rare. Audio features are generally specific to speech or music signals. In this paper, we present new audio feature sets that lead to the definition of four classes within noises: colored, pseudo-periodic, impulsive, and sinusoids. This classification relies on works ...
Efficient voice activity detection algorithm using long-term spectral flatness measure
This paper proposes a novel and robust voice activity detection (VAD) algorithm that uses a long-term spectral flatness measure (LSFM) and is capable of working at signal-to-noise ratios (SNRs) of 10 dB and lower. This new LSFM-based VAD improves speech detection robustness in various noisy environments by employing a low-variance spectrum estimate and an adaptive threshold. The discriminative powe...
Journal title: CoRR
Volume: abs/1510.08484
Publication date: 2015