wovenbytoyota-vai/InstVL

Name: wovenbytoyota-vai/InstVL
Creator: wovenbytoyota-vai
Published: 2025-10-15 04:44:26
License: Not specified

Source: Hugging Face (updated 2025-10-15; indexed 2025-10-25)

Download link:

https://hf-mirror.com/datasets/wovenbytoyota-vai/InstVL

Description:

InstVL is a large-scale, instance-aware spatio-temporal vision-language dataset designed to bridge the gap between holistic scene understanding and fine-grained, instance-level comprehension. It provides two levels of detailed textual annotations: global captions and instance captions. The dataset contains over 3.4 million instances in over 2 million images and 50,000 videos, offering rich supervision for instance-centric pre-training and benchmarking.
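The two annotation levels described above can be pictured with a minimal sketch. It is only an illustration: the class names, field names, and sample values below are hypothetical and are not taken from the InstVL release, whose actual schema may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InstanceAnnotation:
    # Hypothetical fields; the real InstVL schema may differ.
    instance_id: int
    caption: str  # instance-level caption


@dataclass
class Sample:
    # Hypothetical container for one image or video.
    media_path: str
    global_caption: str  # scene-level caption
    instances: List[InstanceAnnotation] = field(default_factory=list)


def instance_captions(sample: Sample) -> List[str]:
    """Return the instance-level captions of one sample, in order."""
    return [inst.caption for inst in sample.instances]


# Made-up example values, for illustration only.
sample = Sample(
    media_path="example.jpg",
    global_caption="A person walks a dog along a street.",
    instances=[
        InstanceAnnotation(0, "A person in a red jacket."),
        InstanceAnnotation(1, "A small brown dog on a leash."),
    ],
)

print(sample.global_caption)
print(instance_captions(sample))
```

Keeping the global caption on the sample and the instance captions in a nested list mirrors the holistic versus instance-level split in the description; how the dataset actually stores spatial or temporal localization is not stated here and is left out of the sketch.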

Provider:

wovenbytoyota-vai
