VIOLIN

Name: VIOLIN
Creator: Carnegie Mellon University
Published: 2020-03-26 04:39:05
License: No description available

Source: arXiv. Updated: 2020-03-26. Indexed: 2024-06-21.

Download link:

https://github.com/jimmy646/violin

Resource description:

VIOLIN is a large-scale video-and-language reasoning dataset created jointly by Carnegie Mellon University and Microsoft Dynamics 365 AI Research. It contains 95,322 video-hypothesis pairs drawn from 15,887 video clips, with a total duration of more than 582 hours. The clips are collected mainly from popular TV shows and YouTube movie clips, and they cover rich content with varied temporal dynamics, event transitions, and interactions between people. The dataset is designed to test a model's multimodal reasoning through joint understanding of video and text, in particular deep commonsense reasoning such as recognizing objects and understanding causal relations between events. Strict annotation strategies and quality control were applied during construction to keep the data high-quality and diverse. VIOLIN is used mainly in video-and-language understanding research, including tasks such as video question answering and visual reasoning.
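As a quick sanity check on the figures quoted above, the short Python sketch below divides the number of video-hypothesis pairs by the number of clips. The function name `pairs_per_clip` is hypothetical and chosen only for illustration; the result is an average and says nothing about how hypotheses are actually distributed across individual clips.

```python
# Figures quoted in the dataset description.
NUM_PAIRS = 95_322  # video-hypothesis pairs
NUM_CLIPS = 15_887  # video clips


def pairs_per_clip(num_pairs: int, num_clips: int) -> float:
    """Average number of hypotheses attached to each clip (illustrative helper)."""
    if num_clips <= 0:
        raise ValueError("num_clips must be positive")
    return num_pairs / num_clips


ratio = pairs_per_clip(NUM_PAIRS, NUM_CLIPS)
print(f"{ratio:.2f} hypotheses per clip on average")  # prints "6.00 hypotheses per clip on average"
```

The two counts divide evenly (15,887 × 6 = 95,322), so the quoted totals are consistent with six hypotheses per clip on average.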

Provider:

Carnegie Mellon University

Created:

2020-03-26
