Combining Features Reduces Hubness in Audio Similarity
نویسندگان
چکیده
In audio based music similarity, a well known effect is the existence of hubs, i.e. songs which appear similar to many other songs without showing any meaningful perceptual similarity. We verify that this effect also exists in very large databases (> 250000 songs) and that it even gets worse with growing size of databases. By combining different aspects of audio similarity we are able to reduce the hub problem while at the same time maintaining a high overall quality of audio similarity.
منابع مشابه
A MIREX Meta-analysis of Hubness in Audio Music Similarity
We use results from the 2011 MIREX “Audio Music Similarity and Retrieval” task for a meta analysis of the hub phenomenon. Hub songs appear similar to an undesirably high number of other songs due to a problem of measuring distances in high dimensional spaces. Comparing 17 algorithms we are able to confirm that different algorithms produce very different degrees of hubness. We also show that hub...
متن کاملAn Efficient Document Clustering Based on HUBNESS Proportional K-Means Algorithm
Evaluating similarity between the documents is a main operation in the text processing field. Similarity measurement is used to estimate the relationship between the records or documents.In existing system similarity between two documents can be computed with respect to feature by using Similarity Measure for Text Processing (SMTP). In proposed hybrid SMTP scheme is integrated with hubness base...
متن کاملHubness in the Context of Feature Selection
Hubness is a property of vector-space data expressed by the tendency of some points (hubs) to be included in unexpectedly many k-nearest neighbor (k-NN) lists of other points in a data set, according to commonly used similarity/distance measures. Alternatively, hubness can be viewed as increased skewness of the distribution of node in-degrees in the k-NN digraph obtained from a data set. Hubnes...
متن کاملCombining Chroma Features For Cover Version Identification
We present an approach for cover version identification which is based on combining different discretized features derived from the chromagram vectors extracted from the audio data. For measuring similarity between features, we use a parameter-free quasi-universal similarity metric which utilizes data compression. Evaluation proves that combined feature distances increase the accuracy in cover ...
متن کاملRidge Regression, Hubness, and Zero-Shot Learning
This paper discusses the effect of hubness in zero-shot learning, when ridge regression is used to find a mapping between the example space to the label space. Contrary to the existing approach, which attempts to find a mapping from the example space to the label space, we show that mapping labels into the example space is desirable to suppress the emergence of hubs in the subsequent nearest ne...
متن کامل