Emotions and Themes in Music

See the MediaEval 2021 webpage for information on how to register and participate.

Task Description

Emotion and theme recognition is a popular task in music information retrieval that is relevant for music search and recommendation systems. We invite participants to try their skills at recognizing the moods and themes conveyed by audio tracks.

The task is to predict the moods and themes conveyed by a music track, given the raw audio. Examples of mood and theme tags include happy, dark, epic, melodic, love, film, and space. Each track is annotated with at least one tag, which serves as the ground truth.

Participants are expected to train a model that takes raw audio as input and outputs the predicted tags. Participants may use any audio input representation they wish, whether traditional handcrafted audio features, spectrograms, or raw audio for deep learning approaches. We also provide a handcrafted feature set extracted with the Essentia audio analysis library as a reference. Third-party datasets may be used for model development and training, but their use must be stated explicitly.

Target Group

Researchers in music information retrieval, music psychology, machine learning, and music and technology enthusiasts in general.

Data

The dataset used for this task is the autotagging-moodtheme subset of the MTG-Jamendo dataset [1], built using audio data from Jamendo and made available under Creative Commons licenses. This subset includes 18,486 audio tracks with mood and theme annotations. There are 57 tags in total, and a track can have more than one tag.
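Because a track can carry several tags, the ground truth is naturally represented as a multi-hot vector with one column per tag. The following is a minimal sketch of that encoding, not the dataset's official loading code: the function names, track IDs, and tag assignments are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def build_tag_index(track_tags: Dict[str, List[str]]) -> Dict[str, int]:
    """Map every tag seen in the annotations to a fixed column index."""
    all_tags = sorted({tag for tags in track_tags.values() for tag in tags})
    return {tag: i for i, tag in enumerate(all_tags)}


def encode_multi_hot(tags: List[str], tag_index: Dict[str, int]) -> List[int]:
    """Return a 0/1 vector with a 1 in the column of each tag the track carries."""
    vector = [0] * len(tag_index)
    for tag in tags:
        vector[tag_index[tag]] = 1
    return vector


# Hypothetical annotations: these track IDs and tag assignments are made up.
annotations = {
    "track_a": ["happy", "love"],
    "track_b": ["dark"],
    "track_c": ["epic", "film", "dark"],
}

tag_index = build_tag_index(annotations)
targets = {
    track: encode_multi_hot(tags, tag_index)
    for track, tags in annotations.items()
}
```

With the full subset, the same encoding would yield one vector of length 57 per track, with at least one entry set to 1.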

We also provide pre-computed statistical features from Essentia, extracted with the feature extractor used for AcousticBrainz. These features were previously used in the MediaEval genre recognition tasks in 2017 and 2018.

Evaluation Methodology

Participants should generate predictions for the test split and submit those to the task organizers.

The outputs generated for the test split will be evaluated with the following metrics, which are commonly used to evaluate auto-tagging systems: macro ROC-AUC and PR-AUC on tag prediction scores. The leaderboard will be based on PR-AUC.
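To make the two metrics concrete, here is a minimal dependency-free sketch of how they could be computed from a matrix of ground-truth tags and a matrix of prediction scores. It rests on two assumptions that the text above does not confirm: PR-AUC is estimated as average precision, and both metrics are macro-averaged (computed per tag, then averaged with equal weight). The organizers' actual evaluation script may differ, and all function names and example values here are hypothetical.

```python
from typing import Callable, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, used here as an assumed estimator of PR-AUC."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    ap, true_pos, i = 0.0, 0, 0
    while i < len(ranked):
        # Consume every item tied at the current score as one threshold.
        j, gained = i, 0
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            gained += ranked[j][1]
            j += 1
        true_pos += gained
        # Recall gained at this threshold, weighted by the precision reached.
        ap += (gained / n_pos) * (true_pos / j)
        i = j
    return ap


def macro_average(
    metric: Callable[[Sequence[int], Sequence[float]], float],
    y_true: List[List[int]],
    y_score: List[List[float]],
) -> float:
    """Compute the metric for each tag (column) and average with equal weight."""
    n_tags = len(y_true[0])
    per_tag = [
        metric([row[t] for row in y_true], [row[t] for row in y_score])
        for t in range(n_tags)
    ]
    return sum(per_tag) / n_tags


# Hypothetical example: 4 tracks (rows) and 2 tags (columns).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.1, 0.8], [0.4, 0.3], [0.6, 0.1]]

macro_roc_auc = macro_average(roc_auc, y_true, y_score)
macro_pr_auc = macro_average(average_precision, y_true, y_score)
```

Both metrics operate on continuous prediction scores rather than thresholded decisions, so a system is rewarded for ranking tracks well within each tag. Macro averaging gives rare tags the same weight as frequent ones.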

For reference, here are the 2019 and 2020 editions of the task.

References and recommended reading

[1] Dmitry Bogdanov, Minz Won, Philip Tovstogan, Alastair Porter and Xavier Serra. 2019. The MTG-Jamendo dataset for automatic music tagging. Machine Learning for Music Discovery Workshop, International Conference on Machine Learning (ICML 2019).

[2] Dmitry Bogdanov, Alastair Porter, Philip Tovstogan and Minz Won. 2019. MediaEval 2019: Emotion and Theme Recognition in Music Using Jamendo. MediaEval 2019 Workshop.

[3] Dmitry Bogdanov, Alastair Porter, Philip Tovstogan and Minz Won. 2020. MediaEval 2020: Emotion and Theme Recognition in Music Using Jamendo. MediaEval 2020 Workshop.

[4] Mohammad Soleymani, Micheal N. Caro, Erik M. Schmidt, Cheng-Ya Sha and Yi-Hsuan Yang. 2013. 1000 songs for emotional analysis of music. In Proceedings of the 2nd ACM international workshop on Crowdsourcing for multimedia (CrowdMM 2013), 1-6.

[5] Anna Aljanaki, Yi-Hsuan Yang and Mohammad Soleymani. 2014. Emotion in music task at MediaEval 2014.

[6] Renato Panda, Ricardo Malheiro and Rui Pedro Paiva. 2018. Musical texture and expressivity features for music emotion recognition. In Proceedings of the International Society on Music Information Retrieval Conference (ISMIR 2018), 383-391.

[7] Cyril Laurier, Owen Meyers, Joan Serra, Martin Blech and Perfecto Herrera. 2009. Music mood annotator design and integration. In 7th International Workshop on Content-Based Multimedia Indexing (CBMI’09), 156-161.

[8] Youngmoo E. Kim, Erik M. Schmidt, Raymond Migneco, Brandon G. Morton, Patrick Richardson, Jeffrey Scott, Jacquelin A. Speck and Douglas Turnbull. 2010. Music emotion recognition: A state of the art review. In Proceedings of the International Society on Music Information Retrieval Conference (ISMIR 2010), 255-266.

[9] Xiao Hu and J. Stephen Downie. 2007. Exploring mood metadata: Relationships with genre, artist and usage metadata. In Proceedings of the International Conference on Music Information Retrieval (ISMIR 2007), 67-72.

Task Organizers

Philip Tovstogan, Music Technology Group, Universitat Pompeu Fabra, Spain
Dmitry Bogdanov, Music Technology Group, Universitat Pompeu Fabra, Spain
Alastair Porter, Music Technology Group, Universitat Pompeu Fabra, Spain
(first.last@upf.edu)

Task Schedule (Updated)

1 June: Data release
12 November: Runs due
19 November: Results returned
29 November 2021: Working notes paper due
13-15 December 2021: MediaEval 2021 Workshop Online

The workshop will be held online.