Automatic Outlier Detection in Music Genre Datasets
نویسندگان
چکیده
Outlier detection, also known as anomaly detection, is an important topic that has been studied for decades. An outlier detection system is able to identify anomalies in a dataset and thus improve data integrity by removing the detected outliers. It has been successfully applied to different types of data in various fields such as cyber-security, finance, and transportation. In the field of Music Information Retrieval (MIR), however, the number of related studies is small. In this paper, we introduce different state-of-the-art outlier detection techniques and evaluate their viability in the context of music datasets. More specifically, we present a comparative study of 6 outlier detection algorithms applied to a Music Genre Recognition (MGR) dataset. It is determined how well algorithms can identify mislabeled or corrupted files, and how much the quality of the dataset can be improved. Results indicate that state-of-the-art anomaly detection systems have problems identifying anomalies in MGR datasets reliably.
منابع مشابه
Training Set Reduction Based on 2-Gram Feature Statistics for Music Genre Recognition
Too large instance and/or feature number for supervised classification requires higher storage demands and computing time, and also the classification quality may suffer from too huge datasets. In our work we examine the reduction of training instance number in music genre recognition where each instance is mapped to a class described by a corresponding 2-gram estimated from the statistical dis...
متن کاملComparing Shallow versus Deep Neural Network Architectures for Automatic Music Genre Classification
In this paper we investigate performance differences of different neural network architectures on the task of automatic music genre classification. Comparative evaluations on four well known datasets of different sizes were performed including the application of two audio data augmentation methods. The results show that shallow network architectures are better suited for small datasets than dee...
متن کاملشناسایی خودکار سبک موسیقی
Nowadays, automatic analysis of music signals has gained a considerable importance due to the growing amount of music data found on the Web. Music genre classification is one of the interesting research areas in music information retrieval systems. In this paper several techniques were implemented and evaluated for music genre classification including feature extraction, feature selection and m...
متن کاملCalculation of climatic reference values and its use for automatic outlier detection in meteorological datasets
The climatic reference values for monthly and annual average air temperature and total precipitation in Catalonia – northeast of Spain – are calculated using a combination of statistical methods and geostatistical techniques of interpolation. In order to estimate the uncertainty of the method, the initial dataset is split into two parts that are, respectively, used for estimation and validation...
متن کاملAudio content processing for automatic music genre classification: descriptors, databases, and classifiers
This dissertation presents, discusses, and sheds some light on the problems that appear when computers try to automatically classify musical genres from audio signals. In particular, a method is proposed for the automatic music genre classification by using a computational approach that is inspired in music cognition and musicology in addition to Music Information Retrieval techniques. In this ...
متن کامل