Automatic Outlier Detection in Music Genre Datasets

Authors

  • Yen-Cheng Lu
  • Chih-Wei Wu
  • Alexander Lerch
  • Chang-Tien Lu
Abstract

Outlier detection, also known as anomaly detection, is an important topic that has been studied for decades. An outlier detection system identifies anomalies in a dataset and thus improves data integrity by allowing the detected outliers to be removed. It has been successfully applied to different types of data in various fields such as cyber-security, finance, and transportation. In the field of Music Information Retrieval (MIR), however, the number of related studies is small. In this paper, we introduce different state-of-the-art outlier detection techniques and evaluate their viability in the context of music datasets. More specifically, we present a comparative study of six outlier detection algorithms applied to a Music Genre Recognition (MGR) dataset. We assess how well these algorithms identify mislabeled or corrupted files and how much the quality of the dataset can be improved. The results indicate that state-of-the-art anomaly detection systems struggle to identify anomalies in MGR datasets reliably.


Similar resources

Training Set Reduction Based on 2-Gram Feature Statistics for Music Genre Recognition

An excessive number of instances and/or features in supervised classification increases storage demands and computing time, and classification quality may also suffer from overly large datasets. In our work we examine the reduction of the number of training instances in music genre recognition, where each instance is mapped to a class described by a corresponding 2-gram estimated from the statistical dis...


Comparing Shallow versus Deep Neural Network Architectures for Automatic Music Genre Classification

In this paper we investigate performance differences of different neural network architectures on the task of automatic music genre classification. Comparative evaluations on four well known datasets of different sizes were performed including the application of two audio data augmentation methods. The results show that shallow network architectures are better suited for small datasets than dee...


Automatic Music Genre Recognition

Automatic analysis of music signals has gained considerable importance due to the growing amount of music data found on the Web. Music genre classification is one of the interesting research areas in music information retrieval systems. In this paper several techniques were implemented and evaluated for music genre classification including feature extraction, feature selection and m...


Calculation of climatic reference values and its use for automatic outlier detection in meteorological datasets

The climatic reference values for monthly and annual average air temperature and total precipitation in Catalonia – northeast of Spain – are calculated using a combination of statistical methods and geostatistical techniques of interpolation. In order to estimate the uncertainty of the method, the initial dataset is split into two parts that are, respectively, used for estimation and validation...


Audio content processing for automatic music genre classification: descriptors, databases, and classifiers

This dissertation presents, discusses, and sheds some light on the problems that appear when computers try to automatically classify musical genres from audio signals. In particular, a method is proposed for automatic music genre classification using a computational approach that is inspired by music cognition and musicology in addition to Music Information Retrieval techniques. In this ...




Publication date: 2016