Memorability: Predicting movie and commercial memorability

See the MediaEval 2026 webpage for information on how to register and participate.

Task description

The goal of this task is to study long-term memory performance when recognising small movie excerpts or commercial videos. We provide the videos together with precomputed features and, where available, EEG features. The challenges proposed in the task ask, for example, how memorable a video is, whether a person is familiar with a video, and whether brand memorability can be predicted. Participants are free to use only the modalities relevant to their approach, enabling a broad range of methodologies.

Subtask 1: Movie Memorability. This task studies the long-term memory performance when recognizing small movie excerpts.

Subtask 2: Commercial/Ad Memorability. This task evaluates long-term memory performance in recognising commercial videos. Participants will use the VIDEM dataset, which contains commercial videos along with their memorability and brand memorability scores, to train their systems. The trained models will then predict the scores for new, unseen commercial videos (product, brand, and concept presentations and discussions). This challenge does not include EEG data.

Participating teams will write short working-notes papers that are published in the MediaEval Workshop Working Notes Proceedings. We welcome two types of papers: first, conventional benchmarking papers, which describe the methods that the teams use to address the task and analyze the results, and, second, “Quest for Insight” papers, which address a question aimed at gaining more insight into the task, but do not necessarily present task results. Example questions for “Quest for Insight” papers are below.

Motivation and background

In an era where visual content, such as movies and commercials, permeates our daily lives, understanding and predicting the memorability of multimedia content is becoming increasingly important. For marketers, filmmakers, and content creators, selecting and designing media that effectively captures attention and leaves a lasting impression is crucial for success. Commercials, in particular, need to engage viewers immediately and remain memorable to drive brand recognition and influence consumer behaviour. However, the potential applications of memorability prediction extend beyond commercial and advertising sectors.

This task aims to develop models that predict the memorability of multimedia content by leveraging various content features. While the results can directly benefit professionals in advertising and film, the insights gained can also be applied to other fields, such as education, content retrieval, and beyond. For instance, educators can use memorability predictions to create more engaging learning materials, while content retrieval systems can enhance search and recommendation accuracy by prioritising content with higher memorability potential.

This year’s task extends the state of the art by focusing on the memorability of multimedia content within the specific domains of movies and commercials. While previous research has explored the general memorability of videos and images, there has been limited focus on how this concept applies to the nuanced structure of films and advertisements. By addressing this gap, we aim to deepen our understanding of how human cognition interacts with multimedia, providing valuable insights into what makes content memorable and how it can be optimized for various applications across different industries, including both commercial and non-commercial use cases.

New for 2026.

For the 2026 edition, we are enhancing the provided datasets by releasing a set of semantic annotations, contextual information, and informative attributes for the existing video samples. By keeping the video set consistent but enriching the available data, we aim to encourage a more granular analysis of how specific video elements influence memorability.

Target group

Researchers interested in this task include those working in areas such as human perception, multimedia content analysis, cognitive science, and machine learning, particularly in image and video analysis, memorability, emotional response to media, aesthetics, and multimedia affective computing, although the task is not limited to these fields.

This includes scholars focused on predictive modeling, user experience, and the cognitive impact of media, with a specific interest in movies, commercials, and educational content. Signal processing researchers can also bring valuable insights to this task by leveraging EEG signals to enhance memorability prediction models. Additionally, researchers exploring content retrieval, recommendation systems, and multimedia interaction, as well as those studying the influence of media on memory and learning, will find the task valuable. It will also appeal to those working on improving machine learning algorithms for content classification and understanding, especially in the video and image domains, and those interested in applying these models across both commercial and non-commercial media, including educational and informational content.

Data

One dataset will be provided for each subtask.

For subtask 1, a subset of the Movie Memorability dataset will be used. This is a collection of movie excerpts and corresponding ground-truth files based on the measurement of long-term memory performance when recognizing small movie excerpts from weeks to years after having viewed them. It is accompanied by audio and video features extracted from the movie excerpts. EEG data are also provided: they were recorded while 27 participants viewed a subset of clips from the dataset, selected to include both previously seen and unseen movies. After viewing each clip, participants were asked whether they remembered seeing it before. In total, 3484 epochs of 64-channel EEG data are available, of which 2122 correspond to clips that were not recognized and 1362 to clips that were remembered.
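The class counts above imply that the EEG recall-detection challenge is imbalanced, so a trivial majority-class predictor already scores above chance. A minimal sketch of that baseline, using only the counts reported for the dataset:

```python
# Class counts reported for the EEG subset:
# 3484 epochs total, 2122 "not recognized" vs. 1362 "remembered".
not_recognized, remembered = 2122, 1362
total = not_recognized + remembered

# A classifier that always predicts the majority class ("not recognized")
# achieves this accuracy; any submitted model should beat it.
majority_baseline = max(not_recognized, remembered) / total
print(f"majority-class accuracy baseline: {majority_baseline:.3f}")  # ~0.609
```

Reporting results relative to this baseline (rather than raw accuracy alone) makes EEG-based systems easier to compare.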

For subtask 2, the VIDEM (VIDeo Effectiveness and Memorability) dataset will be used. It focuses on video and brand memorability in commercial advertisements, including some educational or explanatory videos, and was developed through a university-business collaboration between the University of Essex and Hub, with support from Innovate UK’s Knowledge Transfer Partnership (grant agreement No. 11071). It is a collection of commercial advertisements and corresponding ground-truth files based on the measurement of long-term memory performance when recognizing them 24 to 72 hours after having viewed them. Each video is accompanied by metadata (title, description, number of views, and duration) as well as audio and video features extracted from the advertisements. The dataset consists of 424 commercial videos sampled from a larger collection of 4791 videos published on YouTube between June 2018 and June 2021. Video lengths range from 7 seconds to 94 minutes; for longer videos, participants watched only the first minute.
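Because participants only watched the first minute of longer videos, duration-derived features should be capped accordingly. A hedged sketch of loading such metadata, with entirely hypothetical column names and values (the actual VIDEM file layout may differ):

```python
import csv
import io

# Hypothetical metadata layout for illustration only; the real VIDEM
# ground-truth files may use different column names and formats.
sample = io.StringIO(
    "video_id,title,views,duration_s,memorability,brand_memorability\n"
    "vid_001,Example ad,10523,30,0.72,0.55\n"
    "vid_002,Example explainer,843,5640,0.48,0.31\n"
)
rows = list(csv.DictReader(sample))

# Videos longer than 60 s were only watched for the first minute,
# so cap the "effective" viewing duration before using it as a feature.
for r in rows:
    r["watched_s"] = min(int(r["duration_s"]), 60)

print([(r["video_id"], r["watched_s"]) for r in rows])
```

The capping step reflects the annotation protocol described above, not a requirement of the task itself.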

Evaluation methodology

Submissions for the video-based prediction challenges will be evaluated using Spearman’s rank correlation coefficient. Additional metrics, such as Mean Squared Error (MSE), may also be used to assess prediction accuracy. For the EEG-based detection of recall (part of subtask 1), submissions will be evaluated using classification accuracy.
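In practice, `scipy.stats.spearmanr` is the usual way to compute the ranking metric; for clarity, here is a minimal pure-Python sketch of both official-style metrics, using made-up scores:

```python
def ranks(values):
    """Assign 1-based ranks, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg_rank
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def mse(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

# Toy example: predictions that preserve the ground-truth ranking
# get rho = 1.0 even though the absolute values differ (nonzero MSE).
gt   = [0.91, 0.64, 0.55, 0.78, 0.70]
pred = [0.85, 0.60, 0.58, 0.80, 0.66]
print(spearman(gt, pred), mse(gt, pred))
```

This also illustrates why both metrics are reported: Spearman's rho rewards getting the ordering right, while MSE penalises miscalibrated absolute scores.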

Quest for insight

References

[1] 2018. R. Cohendet, K. Yadati, N. Q. Duong, and C.-H. Demarty. Annotating, understanding, and predicting long-term video memorability. In Proceedings of the ICMR 2018 Conference, Yokohama, Japan, June 11–14, 2018.

[2] 2025. R. S. Kiziltepe, S. Sahab, R. Valladares Santana, F. Doctor, K. Paterson, D. Hunstone and A. García Seco de Herrera. VIDEM: VIDeo Effectiveness and Memorability Dataset. In Proceedings of the 18th International Work-Conference on Artificial Neural Networks (IWANN 2025), A Coruña, Spain, June 16–18, 2025.

[3] 2014. Phillip Isola, Jianxiong Xiao, Devi Parikh, Antonio Torralba, and Aude Oliva. What makes a photograph memorable? IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 7 (2014), 1469–1482.

[4] 2023. T. Dumont, J. S. Hevia, and C. L. Fosco. Modular memorability: Tiered representations for video memorability prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), pp. 10751–10760.

[5] 2025. P. Kumar et al. Eye vs. AI: Human Gaze and Model Attention in Video Memorability. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025), Tucson, AZ, USA, 2025.

[6] 2025. H. Si et al. Long-Term Memorability On Advertisements. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025), Tucson, AZ, USA, 2025.

Task organizers

Task schedule

The program will be updated with the exact dates.

Acknowledgements

More details will follow.