nyu-visionx/VSI-590K
Source: Hugging Face. Updated 2025-11-07. Indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/nyu-visionx/VSI-590K
Summary:
VSI-590K is a large-scale instruction-tuning dataset focused on spatial reasoning. Curated from diverse sources and carefully annotated, it contains 590,667 question-answer pairs, 5,963 unique videos, and 44,858 unique images. It combines annotated real videos, simulated data, and unannotated real videos, and supports tasks such as visual question answering, video-text-to-text, and image-to-text.
Provider:
nyu-visionx
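
A minimal sketch of fetching the dataset from the mirror listed above, assuming the `huggingface_hub` package is installed and that the mirror accepts the standard Hub API. The helper names `dataset_url` and `download` and the `local_dir` default are illustrative, not part of the dataset's own tooling. The repository layout is not described on this page, so the sketch downloads the whole snapshot rather than selecting specific files.

```python
REPO_ID = "nyu-visionx/VSI-590K"
# Host taken from the download link on this page (assumed to be Hub-compatible).
MIRROR_ENDPOINT = "https://hf-mirror.com"


def dataset_url(repo_id: str = REPO_ID, endpoint: str = MIRROR_ENDPOINT) -> str:
    """Build the dataset page URL for a Hub-compatible endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download(local_dir: str = "VSI-590K", endpoint: str = MIRROR_ENDPOINT) -> str:
    """Download a snapshot of the dataset repository and return its local path.

    The import is deferred so that dataset_url() works without huggingface_hub.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",   # required: this is a dataset repo, not a model
        local_dir=local_dir,   # hypothetical target directory
        endpoint=endpoint,     # point the client at the mirror
    )


if __name__ == "__main__":
    print(dataset_url())
```

Calling `download()` transfers the full repository, which includes video and image data and may be large. Passing `endpoint="https://huggingface.co"` targets the official Hub instead of the mirror.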



