ellisbrown/SIMS-VSI
Source: Hugging Face. Updated 2025-11-07. Indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/ellisbrown/SIMS-VSI
Description:
SIMS-VSI is a large-scale synthetic training dataset for spatial video understanding, comprising 203,048 question-answer pairs across 2,507 video trajectories through 1,261 procedurally generated indoor scenes. The dataset leverages the privileged information available in 3D simulators to generate spatially rich video training data with perfect ground-truth annotations. Using procedurally generated environments from AI2-THOR and ProcTHOR with Objaverse objects, we capture agent navigation trajectories and programmatically generate diverse spatial reasoning questions that would be prohibitively expensive to annotate in real-world video data.
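To put the quoted counts in proportion, the short sketch below derives average densities from the three figures in the description. Only the three counts come from the description; the variable names and the `per_unit` helper are illustrative assumptions, and the averages say nothing about how questions are actually distributed across individual videos.

```python
# Counts quoted in the dataset description.
QA_PAIRS = 203_048
TRAJECTORIES = 2_507
SCENES = 1_261


def per_unit(total: int, units: int) -> float:
    """Return the average number of `total` items per group (hypothetical helper)."""
    return total / units


qa_per_trajectory = per_unit(QA_PAIRS, TRAJECTORIES)
trajectories_per_scene = per_unit(TRAJECTORIES, SCENES)
qa_per_scene = per_unit(QA_PAIRS, SCENES)

print(f"QA pairs per trajectory: {qa_per_trajectory:.1f}")
print(f"Trajectories per scene:  {trajectories_per_scene:.2f}")
print(f"QA pairs per scene:      {qa_per_scene:.1f}")
```

On average this works out to roughly 81 question-answer pairs per trajectory and about two trajectories per scene.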
Provider:
ellisbrown



