ViTT Dense Video Captioning Dataset

超神经 (HyperAI) · Updated 2022-09-24 · Indexed 2024-05-15

Download link:

https://hyper.ai/cn/datasets/21095

Description:

ViTT, short for Video Timeline Tags, is a dataset of 8,169 videos with human-generated segment-level annotations. Of these, 5,840 videos were annotated once and the remaining videos were annotated two or more times, for a total of 12,461 released annotation sets. All videos in the dataset are sourced from the YouTube-8M dataset.
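The counts above imply a few figures that the description does not state directly. The short sketch below derives them from the three published numbers only; the variable names are illustrative and are not part of any ViTT tooling.

```python
# Counts as stated in the dataset description.
total_videos = 8169
videos_annotated_once = 5840
total_annotation_sets = 12461

# Videos that received two or more annotation sets.
videos_annotated_multiple = total_videos - videos_annotated_once

# Annotation sets that belong to those multiply annotated videos
# (each singly annotated video accounts for exactly one set).
sets_on_multiple = total_annotation_sets - videos_annotated_once

# Average number of annotation sets per video, for the multiply
# annotated subset and for the dataset as a whole.
mean_sets_per_multi_video = sets_on_multiple / videos_annotated_multiple
mean_sets_per_video = total_annotation_sets / total_videos

print(videos_annotated_multiple)   # 2329
print(sets_on_multiple)            # 6621
print(round(mean_sets_per_multi_video, 2))
print(round(mean_sets_per_video, 2))
```

So 2,329 videos carry the remaining 6,621 annotation sets, which is a little under three sets per multiply annotated video on average.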

Created:

2022-09-24

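Collected Summary

Dataset Introduction

Background and Challenges

Background Overview

The ViTT dense video captioning dataset contains 8,169 videos drawn from the YouTube-8M dataset, with 12,461 sets of human-generated segment-level annotations; 5,840 videos were annotated once and the rest were annotated more than once. The dataset is mainly used for research on video captioning and video understanding.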

The summary above was collected and generated by 遇见数据集.