thumos14

Name: thumos14
Creator: OpenDataLab
Published: 2026-05-17 11:30:31
License: No description available

OpenDataLab, updated 2026-05-17, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/thumos14

Resource description:

Automatically recognizing and localizing a large number of action categories in videos captured in the wild is important for video understanding and multimedia event detection. The THUMOS workshop and challenge aim to explore new challenges and approaches for large-scale action recognition with a large number of classes, using open-source videos in realistic settings. Most existing action recognition datasets consist of videos that have been manually trimmed to bound the action of interest. This has been identified as a considerable limitation, because it poorly matches how action recognition is applied in practice. THUMOS 2014 therefore conducts the challenge on temporally untrimmed videos. Participants may train their methods on trimmed clips, but must test their systems on untrimmed data.

The new dataset contains over 254 hours of video data and 25 million frames, and consists of the following components:

- Training set: more than 13,000 temporally trimmed videos from 101 action classes.
- Validation set: over 1,000 temporally untrimmed videos with temporal annotations of actions.
- Background set: more than 2,500 relevant videos guaranteed not to contain any instance of the 101 actions.
- Test set: over 1,500 temporally untrimmed videos with withheld ground truth.
- Spatiotemporal annotations: bounding box annotations for 24 action classes.

All videos were collected from YouTube, and their pre-extracted low-level features (improved dense trajectory features) are provided.

Challenge entries are evaluated on the THUMOS 2014 dataset in two tasks:

1. Action recognition: submissions for whole-clip action recognition over the 101 classes are accepted.
2. Temporal action detection: submissions for action recognition and temporal localization on 20 action classes are accepted.
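To make the idea of temporal localization concrete, the following is a minimal Python sketch of how a temporal action annotation might be represented and compared with a predicted segment. The `TemporalSegment` schema, its field names, the video ID, and the label are all hypothetical and are not taken from the THUMOS 2014 annotation files. Interval intersection-over-union is shown as one common way to measure overlap between two time segments; the description above does not specify the challenge's evaluation protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalSegment:
    """One action instance in an untrimmed video (hypothetical schema)."""
    video_id: str
    label: str
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds


def temporal_iou(a: TemporalSegment, b: TemporalSegment) -> float:
    """Intersection over union of two time intervals; 0.0 if they do not overlap."""
    intersection = max(0.0, min(a.end_s, b.end_s) - max(a.start_s, b.start_s))
    union = (a.end_s - a.start_s) + (b.end_s - b.start_s) - intersection
    return intersection / union if union > 0 else 0.0


# Illustrative values only, not real annotations.
ground_truth = TemporalSegment("video_0001", "ExampleAction", 10.0, 20.0)
prediction = TemporalSegment("video_0001", "ExampleAction", 15.0, 25.0)
overlap = temporal_iou(ground_truth, prediction)
```

In this sketch the two segments share 5 seconds out of a combined span of 15 seconds, so `overlap` is one third. A detection would typically be counted as correct only if the labels match and the overlap exceeds a chosen threshold; the threshold itself is an assumption left to the reader.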

Provided by:

OpenDataLab

Created:

2022-09-01

Collected Summary

Dataset Introduction