Exploring the Music Genome: Lyric Clustering with Heterogeneous Features

Author

  • Tamsin Maxwell
Abstract

This research explores the clustering of songs using lyric features, both grouped into similar classes and in heterogeneous combinations. Simple techniques are used to extract 140 features for analysis with Kohonen self-organising maps (SOMs). The maps are evaluated by visual analysis and by objective measures of validity with respect to the clustering of eight hand-selected song pairs. According to gold-standard human-authored playlists, judgments of song similarity are based strongly on music; however, this observation may be limited to playlists and does not necessarily extend to music in the wider domain. In particular, since test song pairs could only be matched effectively when they came from the same genre, analysing the correspondence between lyrics and expert human judgments of genre and style may be more fruitful than comparing against similarities observed in playlists. Results suggest that for music in the hard-to-differentiate categories of pop, rock and related genres, a combination of features relating to language, grammar, sentiment and repetition improves on the clustering performance of Information Space, giving a more accurate analysis of song similarity and increased sensitivity to the nuances of song style. SOM analysis further suggests that a few well-chosen attributes may be as good as, if not better than, deep analysis using many features. Results using stress patterns are inconclusive. Although the results are preliminary and need to be validated by further research on a larger data set, to the author's knowledge this is the first time success has been reported in differentiating songs in the rock/pop genre.
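To make the clustering step concrete, the following is a minimal sketch of a Kohonen self-organising map in plain Python. It is not the author's implementation: the three feature names, the toy feature values, the 3x3 grid size, and the learning-rate and neighbourhood schedules are all assumptions chosen for illustration, whereas the study itself uses 140 extracted features.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )

def train_som(data, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0  # assumed initial neighbourhood radius
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighbourhood, floored
        for x in rng.sample(data, len(data)):      # present songs in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights

# Hypothetical lyric features per song: [repetition, vocabulary richness, sentiment].
# These values are invented for illustration only.
songs = {
    "pair A, song 1": [0.90, 0.20, 0.80],
    "pair A, song 2": [0.85, 0.25, 0.75],
    "pair B, song 1": [0.10, 0.80, 0.20],
    "pair B, song 2": [0.15, 0.75, 0.25],
}

som = train_som(list(songs.values()), rows=3, cols=3)
placement = {name: best_matching_unit(som, vec) for name, vec in songs.items()}
for name, node in placement.items():
    print(name, "->", node)
```

Songs whose feature vectors are similar should land on the same or nearby map nodes, which is the property the abstract's song-pair evaluation relies on. How close a given pair ends up depends on the features chosen, which is the question the study investigates.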


Similar resources

Lyric Text Mining in Music Mood Classification

This research examines the role lyric text can play in improving audio music mood classification. A new method is proposed to build a large ground truth set of 5,585 songs and 18 mood categories based on social tags so as to reflect a realistic, user-centered perspective. A relatively complete set of lyric features and representation models were investigated. The best performing lyric feature s...

Discourse Analysis of Lyric and Lyric-Based Classification of Music

Lyrics play an important role in the semantics and the structure of many pieces of music. However, while many existing lyric analysis systems consider each sentence of a given set of lyrics separately, lyrics are more naturally understood as multi-sentence units, where the relations between sentences is a key factor. Here we describe a series of experiments using discourse-based features, which...

Music Emotion Regression based on Multi-modal Features

Music emotion regression is considered more appropriate than classification for music emotion retrieval, since it resolves some of the ambiguities of emotion classes. In this paper, we propose an AdaBoost-based approach for music emotion regression, in which emotion is represented in PAD model and multi-modal features are employed, including audio, MIDI and lyric features. We first demonstrate ...

Exploration of Music Emotion Recognition Based on MIDI

Audio and lyric features are commonly considered in the research of music emotion recognition, whereas MIDI features are rarely used. Some research revealed that among the features employed in music emotion recognition, lyric has the best performance on valence, MIDI takes the second place, and audio is the worst. However, lyric cannot be found in some music types, such as instrumental music. I...

Automatic Prediction of Hit Songs

Keywords: hit song detection, music classification. We explore the automatic analysis of music to identify likely hit songs. We extract both acoustic and lyric information from each song and separate hits from non-hits using standard classifiers, specifically Support Vector Machines and boosting classifiers. Our features are based on global sounds learnt in an unsupervised fashion from acoustic data or gl...

Publication date: 2007