VITATECS

Name: VITATECS
Creator: National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University
Published: 2023-11-29 15:15:34
License: Not specified

Source: arXiv | Updated: 2023-11-29 | Indexed: 2024-06-21

Download link:

https://github.com/lscpku/VITATECS

Description:

VITATECS is a video-text dataset developed by the National Key Laboratory for Multimedia Information Processing at Peking University for evaluating how well video-language models understand temporal concepts. It contains over 13,000 samples. Its high-quality counterfactual descriptions were produced with a semi-automatic data collection framework that combines large language models with human-in-the-loop annotation. By decoupling temporal information from static information, VITATECS aims to fill a gap in the evaluation of temporal concept understanding. It diagnoses video-language models through a fine-grained taxonomy of temporal concepts: direction, intensity, sequence, localization, compositionality, and type. Application areas include video captioning, video question answering, and video-text retrieval, where existing datasets fall short in evaluating temporal understanding.
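As a rough illustration of how caption and counterfactual pairs could support per-aspect diagnosis, the sketch below counts how often a model scores the matching description above the counterfactual one. The `Sample` layout, the `score` callable, and the toy data are all assumptions made for illustration; they are not taken from the VITATECS repository, and the official evaluation protocol may differ.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class Sample(NamedTuple):
    # Hypothetical record layout; the real VITATECS files may be organized differently.
    video_id: str
    caption: str          # description that matches the video
    counterfactual: str   # description altered in one temporal aspect
    aspect: str           # e.g. "direction", "intensity", "sequence"


def accuracy_by_aspect(
    samples: Iterable[Sample],
    score: Callable[[str, str], float],
) -> Dict[str, float]:
    """Per aspect, the fraction of samples where the caption outscores the counterfactual."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for s in samples:
        total[s.aspect] += 1
        if score(s.video_id, s.caption) > score(s.video_id, s.counterfactual):
            correct[s.aspect] += 1
    return {aspect: correct[aspect] / total[aspect] for aspect in total}


def toy_score(video_id: str, text: str) -> float:
    # Stand-in for a real video-language model: it simply favors texts containing "up".
    return 1.0 if "up" in text.split() else 0.0


# Made-up examples, not drawn from the dataset.
samples = [
    Sample("v1", "a man walks up the stairs", "a man walks down the stairs", "direction"),
    Sample("v2", "a ball rolls down the hill", "a ball rolls up the hill", "direction"),
    Sample("v3", "she stirs then pours", "she pours then stirs", "sequence"),
]
print(accuracy_by_aspect(samples, toy_score))
```

In practice `score` would wrap a video-text similarity model; ties are counted as failures here, which is one possible convention among several.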
