SportsVideo: Fine Grained Action Classification and Position Detection in Table Tennis and Swimming Videos

Task description

Detecting and classifying positions and actions is one of the main challenges in visual content analysis and mining. Sport video analysis has been a very popular research topic because of its variety of application areas, ranging from the analysis of athletes’ performance and rehabilitation to intelligent multimedia devices with user-tailored digests. This year we propose a series of 6 tasks, each divided into 2 sub-tasks, one for each of two sports: table tennis and swimming. These tasks follow up on the 2022 Sport Task and SwimTrack.

Task 1 - athlete position detection

Task 2 - stroke detection

Task 3 - motion classification

Task 4 - field/table registration

Task 5 - sound detection

Task 6 - score and results extraction

Target group

The task is of interest to researchers in the areas of machine learning (classification), visual content analysis, computer vision and sport performance. We explicitly encourage researchers focusing on computer-aided analysis of sport performance to participate.


Our focus is on recordings made with widespread, inexpensive video cameras, e.g. GoPro, as well as high-quality videos, e.g. Blackmagic 4K.

Ground truth

Each video has been manually annotated by experts. For event-based annotations, we have annotated the moments in the video that are relevant to the event. For positions, we have annotated key and intermediate positions of the athlete and relied on interpolation for the remaining positions.
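To illustrate how positions between annotated frames might be filled in, here is a minimal sketch. It assumes linear interpolation between the two nearest annotated frames; the interpolation scheme actually used for the ground truth is not specified here. The function name `interpolate_position`, the dictionary layout and the (x, y) pixel coordinates are hypothetical and chosen only for this example.

```python
from bisect import bisect_right


def interpolate_position(keyframes, frame):
    """Estimate an athlete's (x, y) position at `frame`.

    `keyframes` maps annotated frame indices to (x, y) positions.
    Linear interpolation is an assumption made for this sketch.
    """
    frames = sorted(keyframes)
    if not frames:
        raise ValueError("at least one annotated frame is required")
    # Outside the annotated range, hold the nearest annotated position.
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    # Find the annotated frames on either side of the requested frame.
    i = bisect_right(frames, frame)
    f0, f1 = frames[i - 1], frames[i]
    t = (frame - f0) / (f1 - f0)
    x0, y0 = keyframes[f0]
    x1, y1 = keyframes[f1]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


# Hypothetical annotations: frame index -> (x, y) in pixels.
annotated = {0: (0.0, 0.0), 10: (10.0, 20.0), 20: (10.0, 0.0)}
print(interpolate_position(annotated, 5))
print(interpolate_position(annotated, 15))
```

With these made-up annotations, the athlete is placed halfway between the two neighbouring annotated positions at frames 5 and 15. A smoother scheme such as spline interpolation could be substituted if the motion between annotated frames is not close to linear.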

Evaluation methodology

Each task will have its own evaluation methodology, which will be provided once the dataset is released.

The CRISP Project page

