Singing Voice Detection in Electronic Music with a Long-Term Recurrent Convolutional Network
نویسندگان
چکیده
Singing Voice Detection (SVD) is a classification task that determines whether there singing voice in given audio segment. While current systems produce high-quality results on this task, the reported experiments are usually limited to popular music. A Long-Term Recurrent Convolutional Network (LRCN) was adapted detect vocals new dataset of electronic music evaluate its performance different genre and compare against those other state-of-the-art pop prove effectiveness across genre. Experiments two datasets studied impacts features block size LRCN temporal relationship learning, benefits preprocessing performance, generate benchmark intricacies.
منابع مشابه
Effective Singing Voice Detection in Popular Music Using Arma Filtering
Locating singing voice segments is essential for convenient indexing, browsing and retrieval large music archives and catalogues. Furthermore, it is beneficial for automatic music transcription and annotations. The approach described in this paper usesMel-Frequency Cepstral Coefficients in conjunction with Gaussian Mixture Models for discriminating two classes of data (instrumental music and si...
متن کاملContext-Aware Features for Singing Voice Detection in Polyphonic Music
The effectiveness of audio content analysis for music retrieval may be enhanced by the use of available metadata. In the present work, observed differences in singing style and instrumentation across genres are used to adapt acoustic features for the singing voice detection task. Timbral descriptors traditionally used to discriminate singing voice from accompanying instruments are complemented ...
متن کاملSinging Voice Detection in North Indian Classical Music
Singing voice detection is essential for contentbased applications such as those involving melody extraction and singer identification. This article is concerned with the accurate detection of singing voice phrases in north Indian classical vocal music. The component sound sources in such music fit into a typical framework (voice, rhythm and drone). We have used this a-priori knowledge to enhan...
متن کاملA Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection
—Traditional saliency models usually adopt hand-crafted image features and human-designed mechanisms to calculate local or global contrast. In this paper, we propose a novel computational saliency model, i.e., deep spatial contextual long-term recurrent convolutional network (DSCLRCN) to predict where people looks in natural scenes. DSCLRCN first automatically learns saliency related local feat...
متن کاملSinging voice detection in polyphonic music using predominant pitch
This paper demonstrates the superiority of energy-based features derived from the knowledge of predominant-pitch, for singing voice detection in polyphonic music over commonly used spectral features. However, such energy-based features tend to misclassify loud, pitched instruments. To provide robustness to such accompaniment we exploit the relative instability of the pitch contour of the singin...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Applied sciences
سال: 2022
ISSN: ['2076-3417']
DOI: https://doi.org/10.3390/app12157405