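AVA: A Finely Labeled Video Dataset for Understanding Human Actions

Name: AVA: A Finely Labeled Video Dataset for Understanding Human Actions
Creator: 帕依提提 (Payititi)
License: not specified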

Indexed by 帕依提提 (Payititi) on 2024-03-04

Download link:

https://www.payititi.com/opendatasets/show-1753.html


Resource description:

Teaching machines to understand human actions in videos is a fundamental research problem in computer vision, and it is essential for applications such as personal video search and discovery, sports analysis, and gesture interfaces. The past few years have brought exciting breakthroughs in classifying and locating objects in images, but recognizing human actions remains a major challenge. The reason is that human actions are, by nature, less well defined than objects in videos, which makes it difficult to build finely labeled action video datasets. Many benchmark datasets (for example UCF101, ActivityNet, and DeepMind's Kinetics) follow the image classification labeling scheme and assign a single label to each video or video clip, but no comparable dataset has existed for complex scenes in which several people perform different actions. To support further research on human action recognition, we are releasing AVA, named after "atomic visual actions", a new dataset that provides multiple action labels for each person in extended video sequences. AVA consists of URLs of publicly available YouTube videos, annotated with a set of 80 atomic actions that are localized in space and time (such as "walk", "kick (an object)", and "shake hands"), resulting in 57,600 video clips, 96,000 labeled action performers, and a total of 210,000 action labels.
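The difference between clip-level labeling and the per-person, multi-label annotation described above can be sketched as follows. This is only an illustrative sketch: the class names, field names, box format, placeholder URL, and timestamp are all hypothetical and do not reflect the dataset's actual file format. The only figures taken from the description are the 96,000 labeled performers and 210,000 action labels.

```python
from dataclasses import dataclass, field


@dataclass
class ClipLabel:
    """Clip-level scheme: a single label for the whole video clip."""
    video_url: str
    label: str


@dataclass
class PersonAnnotation:
    """Per-person scheme: one box and one or more atomic actions."""
    # Hypothetical box format: normalized (x1, y1, x2, y2).
    box: tuple
    actions: list = field(default_factory=list)


@dataclass
class SegmentAnnotation:
    """All annotated people at one point in a video (hypothetical layout)."""
    video_url: str
    timestamp_s: float
    people: list = field(default_factory=list)


def labels_per_performer(total_labels: int, total_performers: int) -> float:
    """Average number of action labels attached to each labeled person."""
    return total_labels / total_performers


# A made-up segment with two people doing different things at once.
segment = SegmentAnnotation(
    video_url="https://www.youtube.com/watch?v=EXAMPLE",  # placeholder URL
    timestamp_s=902.0,
    people=[
        PersonAnnotation((0.10, 0.20, 0.45, 0.95), ["walk", "shake hands"]),
        PersonAnnotation((0.50, 0.15, 0.90, 0.97), ["shake hands"]),
    ],
)

# Counts quoted in the description: 210,000 labels over 96,000 performers.
avg = labels_per_performer(210_000, 96_000)
print(avg)  # 2.1875
```

Using the counts quoted in the description, each labeled person carries a little over two action labels on average, which a single clip-level label cannot express.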

Provider:

帕依提提 (Payititi)

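Collected summary

Dataset introduction

Background and challenges

Background overview

The AVA dataset is a finely labeled video dataset of human actions. It contains 57,600 video clips from YouTube and 96,000 labeled action performers, annotated with 80 atomic actions, and is intended to advance research on human action recognition.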

The content above was collected and summarized by 遇见数据集.