MIREX-2010 “Audio Key Detection” Task: ircamkeymode
Author
Abstract
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2010 for the “Audio Key Detection” task. The system, named ircamkeymode, performs key (C, Db, D, E, ...) and mode (Major, minor) detection. It is a simplified version of the systems described in [1] [2]. We briefly summarize it below.
1. OVERVIEW OF THE MODEL
1.1 Chromagram extraction: The signal is first converted to mono and down-sampled to 11.025 kHz. At each frame, the DFT of the signal is computed using a Blackman analysis window of length L = 0.3715 s with a hop size of L/2. After normalization by its maximum value, the amplitude DFT is converted to a sone scale. The computation of the sone-converted values is similar to the one used in [3]. Thresholding (removing values below 1% of the maximum) and peak-picking are then applied. A 36-bin (3 bins per semitone) chroma representation [4] [5] is then computed. Only frequencies between 100 and 2000 Hz are considered. The shape of the chroma filters is chosen as a hyperbolic tangent with 50% overlap. Each of the 36 chroma channels is smoothed over time using median filtering.
1.2 Key/Mode templates creation: We use an approach similar to Gomez [6]: the key profiles are created by extending the Krumhansl & Schmuckler (Temperley or Diatonic) pitch distribution profile to the polyphonic (several pitches) and audio (several harmonics for each pitch) cases. For each key, we consider the three main triads in this key: the tonic, dominant and subdominant triads (for example, in C Major: C-E-G, G-B-D, F-A-C). The chroma vector corresponding to each single note of a specific triad is computed by adding the contributions of its harmonics h. The harmonic h is given a contribution of $0.6^{h-1}$. Only the first 4 harmonics are considered. For a specific triad, the chroma vectors corresponding to the three notes are added.
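As a worked illustration of the harmonic weighting just described, consider the single note C. The sketch below assumes that each harmonic is mapped to its nearest semitone and folded into 12 pitch classes; the abstract does not state the mapping, and the actual system works on 36 chroma bins, so this is a simplification.

```latex
% Assumption: harmonic h of a note is mapped to the nearest semitone,
% \Delta(h) = \mathrm{round}(12 \log_2 h), and folded into 12 pitch classes.
\begin{align*}
h = 1:&\quad \Delta = 0  \;\Rightarrow\; \text{C},\quad w = 0.6^{0} = 1 \\
h = 2:&\quad \Delta = 12 \;\Rightarrow\; \text{C},\quad w = 0.6^{1} = 0.6 \\
h = 3:&\quad \Delta = 19 \;\Rightarrow\; \text{G},\quad w = 0.6^{2} = 0.36 \\
h = 4:&\quad \Delta = 24 \;\Rightarrow\; \text{C},\quad w = 0.6^{3} = 0.216
\end{align*}
% Resulting chroma of the single note C: 1.816 on C and 0.36 on G.
```

Under these assumptions, the chroma vector of the C-E-G triad is the sum of three such vectors, one for each of C, E and G.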
Finally, for a specific key, the key-chroma vector is computed by adding the three triad-chroma vectors. Each triad-chroma vector is weighted by the value of Krumhansl’s (Temperley or Diatonic) profile at the position of the root of the triad in the key (for example, 6 for the F-A-C triad in C Major). The result is a 12-dimensional chroma profile vector for each of the 24 keys: $C_i$, $i \in [1, 24]$.
1.3 Key/Mode decision: The most likely key/mode of the track is estimated using an approach similar to Izmirli [7]. The chroma vectors c(t) are extracted on a frame basis. At each time t, we estimate the key $C_i$ that has the highest correlation (we use the cosine distance) with the chroma vector cumulated over time. We attribute to this key a score proportional to the distance between its correlation value and the correlation value of the second most likely key. This score acts as a reliability coefficient. The final key is the one with the maximum score cumulated over time. Only the first 20 seconds of each track are considered.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. © 2010 International Society for Music Information Retrieval.
2. FLOWCHART OF THE MODEL
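The template creation and decision stages of Sections 1.2 and 1.3 can be sketched roughly as follows. This is a minimal illustration, not the submitted system: the function names are hypothetical, the profile values are the commonly cited Krumhansl-Kessler ones (the abstract also allows Temperley or diatonic profiles), harmonics are assumed to map to the nearest semitone, the triad qualities used for minor keys are an assumption, and cosine similarity stands in for the correlation measure. It also works on 12 pitch classes and takes chroma frames as given rather than extracting them from audio.

```python
import math

NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Assumed Krumhansl-Kessler profiles, indexed by semitones above the tonic.
PROFILES = {
    "Major": [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88],
    "minor": [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17],
}

# (root offset, triad intervals) for the tonic, subdominant and dominant triads.
# The qualities chosen for minor keys are an assumption of this sketch.
MAJ, MIN = (0, 4, 7), (0, 3, 7)
TRIADS = {
    "Major": [(0, MAJ), (5, MAJ), (7, MAJ)],
    "minor": [(0, MIN), (5, MIN), (7, MAJ)],
}


def note_chroma(pitch_class, n_harmonics=4, decay=0.6):
    """Chroma of one note: harmonic h contributes decay ** (h - 1)."""
    chroma = [0.0] * 12
    for h in range(1, n_harmonics + 1):
        offset = round(12 * math.log2(h))  # nearest semitone (assumption)
        chroma[(pitch_class + offset) % 12] += decay ** (h - 1)
    return chroma


def build_templates():
    """Return (names, templates) for the 24 keys, 12 values per template."""
    names, templates = [], []
    for mode in ("Major", "minor"):
        base = [0.0] * 12  # template for tonic C
        for root, intervals in TRIADS[mode]:
            weight = PROFILES[mode][root]  # profile value at the triad root
            for interval in intervals:
                note = note_chroma((root + interval) % 12)
                for pc, value in enumerate(note):
                    base[pc] += weight * value
        for tonic in range(12):
            names.append(f"{NOTE_NAMES[tonic]} {mode}")
            # Rotate the C-based template so that index `tonic` holds the tonic.
            templates.append([base[(pc - tonic) % 12] for pc in range(12)])
    return names, templates


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def detect_key(chroma_frames, names, templates):
    """Accumulate chroma over time and score the best key at each frame."""
    cumulated = [0.0] * 12
    scores = [0.0] * len(templates)
    for frame in chroma_frames:
        cumulated = [c + f for c, f in zip(cumulated, frame)]
        sims = [cosine(cumulated, t) for t in templates]
        ranked = sorted(range(len(sims)), key=sims.__getitem__, reverse=True)
        best, second = ranked[0], ranked[1]
        # Reliability score: margin between the best and second-best keys.
        scores[best] += sims[best] - sims[second]
    return names[scores.index(max(scores))]
```

A caller would pass only the chroma frames covering the first 20 seconds of a track, for example `detect_key(frames, *build_templates())`, where `frames` is a list of 12-value chroma vectors.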
Similar resources
MIREX 2010 Audio Onset Detection
This paper presents an approach for the Audio Onset Detection task [1], which is submitted to MIREX 2010. In MIREX 2009, we presented our approach that utilizes information on the general characteristics of the notes for onset categorization, as well as integrates energy-based and pitch-based detection results. In MIREX 2010, we extend our submission to MIREX 2009 by parameters fine-tuning and ...
MIREX-2013 “Audio Key Detection” Task: ircamkeymode
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2013 for the “Audio Key Detection” task. The system named ircamkeymode performs key (C, Db, D, E, ...) and mode (Major, minor) detection. The system is a simplified version of the systems described in [6] [5]. We briefly summarized it below. 1. OVERVIEW OF THE MODEL 1.1 Chromagram extracti...
MIREX-2011 “Audio Key Detection” Task: ircamkeymode
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2011 for the “Audio Key Detection” task. The system named ircamkeymode performs key (C, Db, D, E, ...) and mode (Major, minor) detection. The system is a simplified version of the systems described in [6] [5]. We briefly summarized it below. 1. OVERVIEW OF THE MODEL 1.1 Chromagram extracti...
MIREX-2012 “Audio Key Detection” Task: ircamkeymode
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2012 for the “Audio Key Detection” task. The system named ircamkeymode performs key (C, Db, D, E, ...) and mode (Major, minor) detection. The system is a simplified version of the systems described in [6] [5]. We briefly summarized it below. 1. OVERVIEW OF THE MODEL 1.1 Chromagram extracti...
MIREX 2010: Joint Recognition of Key and Chord from Music Audio Signals Using Key-Modulation HMM
This extended abstract describes a submission to the Music Information Retrieval Evaluation eXchange 2010 (MIREX 2010) in the Audio Chord Estimation and Audio Key Detection tasks. We propose a new model to recognize musical keys and chords simultaneously from musical acoustic signals including key modulations. Chords and keys are closely related notions of music involving harmony. Since occurre...