SportsVideo: Fine Grained Action Classification and Position Detection in Table Tennis and Swimming Videos

See the MediaEval 2023 webpage for information on how to register and participate.

Task description

Position and action detection and classification are among the main challenges in visual content analysis and mining. Sport video analysis has been a very popular research topic due to the variety of application areas, ranging from analysis of athletes’ performances and rehabilitation to intelligent multimedia devices with user-tailored digests. This year we propose a series of six tasks, each divided into two sub-tasks covering two sports: table tennis and swimming. These tasks follow up on the 2022 Sport Task and SwimTrack.

Task 1 - athlete position detection

Task 2 - stroke detection

Task 3 - motion classification

Task 4 - field/table registration

Task 5 - sound detection

Task 6 - score and results extraction

Target group

The task is of interest to researchers in the areas of machine learning (classification), visual content analysis, computer vision and sport performance. We explicitly encourage researchers working specifically on computer-aided analysis of sport performance to participate.


Our focus is on recordings made both with widespread, inexpensive video cameras (e.g. GoPro) and with high-quality cameras (e.g. Blackmagic 4K).

Ground truth

Each video has been manually annotated by experts. For event-based annotations, we have annotated the moments in the video that are relevant to the event. For positions, we have annotated key and intermediate positions of the athlete and relied on interpolation for the remaining positions.
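As an illustration of how positions between annotated frames could be filled in, here is a minimal sketch. It assumes linear interpolation between the two nearest annotated frames and a single (x, y) point per athlete; the task description does not specify the interpolation scheme or the annotation format, so the function name `interpolate_position` and the data layout are hypothetical.

```python
from bisect import bisect_right


def interpolate_position(keyframes, frame):
    """Estimate an (x, y) position at `frame` from annotated keyframes.

    `keyframes` maps annotated frame indices to (x, y) positions.
    Assumption: linear interpolation between the two nearest annotated
    frames; frames outside the annotated range are clamped to the
    first or last annotation.
    """
    frames = sorted(keyframes)
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    # Index of the first annotated frame strictly after `frame`.
    hi = bisect_right(frames, frame)
    f0, f1 = frames[hi - 1], frames[hi]
    t = (frame - f0) / (f1 - f0)  # fraction of the way from f0 to f1
    (x0, y0), (x1, y1) = keyframes[f0], keyframes[f1]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


# Hypothetical annotations: frame index -> (x, y) in pixels.
annotated = {0: (0.0, 0.0), 10: (10.0, 20.0), 20: (10.0, 0.0)}
midpoint = interpolate_position(annotated, 5)  # halfway between frames 0 and 10
```

Participants who need denser or smoother trajectories may prefer a different scheme (for example spline interpolation); the linear version is shown only because it is the simplest to reason about.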

Evaluation methodology

Each task will have its own evaluation methodology, which will be provided once the dataset is released.

Quest for insight

Participant information

Please contact the task organizers by email if you have questions (see below).

The CRISP Project page

Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Péteri, Julien Morlier. Three-Stream 3D/1D CNN for Fine-Grained Action Classification and Segmentation in Table Tennis. 4th International ACM Workshop on Multimedia Content Analysis in Sports, ACM Multimedia, Oct 2021, Chengdu, China.

Kaustubh Milind Kulkarni, Sucheth Shenoy. Table Tennis Stroke Recognition Using Two-Dimensional Human Pose Estimation. CVPR Workshops 2021, 4576–4584.

Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Péteri, Julien Morlier. Fine grained sport action recognition with siamese spatio-temporal convolutional neural networks. Multimedia Tools and Applications, vol. 79, 20429–20447, Springer (2020).

Extended work in: Pierre-Etienne Martin. Fine-Grained Action Detection and Classification from Videos with Spatio-Temporal Convolutional Neural Networks. Application to Table Tennis. Neural and Evolutionary Computing [cs.NE]. Université de Bordeaux; Université de la Rochelle, 2020.

Gül Varol, Ivan Laptev, and Cordelia Schmid. Long-Term Temporal Convolutions for Action Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40, 6 (2018), 1510–1517.

Joao Carreira and Andrew Zisserman. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. CoRR abs/1705.07750 (2017).

Chunhui Gu, Chen Sun, Sudheendra Vijayanarasimhan, Caroline Pantofaru, David A. Ross, George Toderici, Yeqing Li, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, and Jitendra Malik. AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions. CoRR abs/1705.08421 (2017).

Khurram Soomro, Amir Roshan Zamir, and Mubarak Shah. UCF101: A dataset of 101 human actions classes from videos in the wild. CoRR abs/1212.0402 (2012).

Nicolas Jacquelin, Romain Vuillemot, and Stefan Duffner. 2021. Detecting Swimmers in Unconstrained Videos with Few Training Data. 8th Workshop on Machine Learning and Data Mining for Sports Analytics (Sept. 2021).

T. F. H. Runia, C. G. M. Snoek, and A. W. M. Smeulders. 2018. Real-World Repetition Estimation by Div, Grad and Curl. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9009–9017.

Timothy Woinoski, Alon Harell, and I. Bajić. 2020. Towards Automated Swimming Analytics Using Deep Neural Networks. ArXiv (2020).

Task organizers


Task Schedule