MediaEval 2022

The MediaEval Multimedia Evaluation benchmark offers challenges in artificial intelligence for multimedia data. Participants address these challenges by creating algorithms for retrieval, analysis, and exploration. Solutions are systematically compared using a common evaluation procedure, making it possible to establish the state of the art and track progress. Our larger aim is to promote reproducible research that makes multimedia a positive force for society.

MediaEval goes beyond other benchmarks and data science challenges in that it also pursues a “Quest for Insight” (Q4I). With Q4I, we push beyond merely improving evaluation scores and also work toward a deeper understanding of the challenges. This includes, for example, properties of the data, strengths and weaknesses of particular types of approaches, and observations about the evaluation procedure.

The MediaEval 2022 Workshop will be held 12-13 January 2023, collocated with MMM 2023 in Bergen, Norway, and also online. For the preliminary workshop proceedings and the workshop schedule, please see the MediaEval Workshop Information Announcement.

Task schedule

The MediaEval Coordination Committee (2022)

MediaEval is grateful for the support of the ACM Special Interest Group on Multimedia.

Task List

DisasterMM: Multimedia Analysis of Disaster-Related Social Media Data

Contribute to disaster management by addressing two subtasks: classify multimodal Twitter data as relevant or non-relevant to flooding events, and develop a named-entity recognizer to identify which words (or sequences of words) in a tweet’s text refer to locations.

Read more.

Emotional Mario: A Game Analytics Challenge

Identify events of high significance in Super Mario Bros. gameplay by analyzing the facial expressions and biometric data of players, and then (optionally) create a video summary of the best moments of play.

Read more.

FakeNews Detection

Participants address three fake news detection subtasks concerning COVID-19-related conspiracy theories on Twitter: first, text-based topic and conspiracy detection; second, graph-based detection of users who post conspiracy theories (posters) in a social network graph with node attributes; and third, a combination of the two to achieve topic and conspiracy detection based on both textual data and graphs.

Read more.

MUSTI - Multimodal Understanding of Smells in Texts and Images

Task participants develop classifiers to predict whether a text passage and an image evoke the same smell source or not and (optionally) detectors to identify common smell sources in text passages and images.

Read more.

Medical Multimedia Task: Transparent Tracking of Spermatozoa

Detect and track spermatozoa in medical video, with the goal of creating a real-time system. Calculate or predict attributes such as speed and travel distance.

Read more.


Participants are supplied with a large set of articles (including text body and headlines) and the accompanying images from international publishers. The task requires participants to predict which image was used to accompany each article.

Read more.

NjordVid: Fishing Trawler Video Analytics Task

Task participants are provided with a surveillance video dataset from a fishing trawler. The overall objective of the task is to gain more insight into what happens on fishing trawlers while protecting the privacy of fishing workers as much as possible. The first subtask is to create a method that can detect unforeseen events on the boat (anomalies). The second subtask is to develop solutions that protect fishing workers' privacy without impairing the automatic analysis of the video streams.

Read more.

Predicting Video Memorability

The task requires participants to automatically predict memorability scores for videos that reflect the probability of a video being remembered. Participants will be provided with an extensive dataset of videos with memorability annotations, related information, pre-extracted state-of-the-art visual features, and electroencephalography (EEG) recordings.

Read more.

Sport Task: Fine-Grained Action Detection and Classification of Table Tennis Strokes from Videos

This task offers two subtasks: classification of temporally segmented videos of single table tennis strokes, and detection of strokes, regardless of their class, in untrimmed video.

Read more.

SwimTrack: Swimmers and Stroke Rate Detection in Elite Race Videos

SwimTrack is a series of five multimedia tasks related to swimming video analysis from elite competition recordings. These tasks involve video, image, and audio analysis and can be addressed independently. When solved together, however, they form a grand challenge to provide sport federations and coaches with novel methods to assess and enhance swimmers’ performance.

Read more.

Urban Air: Urban Life and Air Pollution

The task requires participants to tackle two subtasks: multimodal/cross-modal air pollution prediction and discovery of periodic traffic-pollution patterns. The first requires participants to predict the Air Quality Index (AQI) in the short- and mid-term future using multimodal/cross-modal data. Notably, participants must predict AQI using (1) only station data and (2) station and CCTV data. The second requires participants to use the given datasets to discover periodic traffic-pollution patterns that can raise citizens' awareness of the mutual impacts of traffic and pollution.

Read more.