Fangjun/RoomSpace
Source: Hugging Face. Updated: 2024-06-20. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/Fangjun/RoomSpace
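Overview:
RoomSpace Benchmark is a benchmark dataset for evaluating spatial reasoning. It contains room samples in several versions. Each sample includes images from two viewpoints (a top-down view and a north-facing view) together with associated questions and answers. The question-answer pairs fall into several types, such as visual QA, the Layout setting, and the TPP setting, covering different spatial reasoning tasks. The dataset's branch structure provides versions of different sizes, such as 100, 1,000, and 10,000 room samples. The dataset aims to give language models a realistic simulated benchmark for spatial reasoning tasks.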
Provider:
Fangjun
Original information summary
Dataset overview
Dataset configurations
- config_name: n3_m2_d144
- features:
- id: int64
- image_top_down: image
- image_north_facing: image
- question_image: string
- options_td: sequence: string
- options_sd: sequence: string
- answer_image_td: string
- answer_image_sd: string
- question_td_layout_yn: string
- answer_td_layout_yn: string
- question_td_layout_fr: string
- answer_td_layout_fr: sequence: string
- question_td_tpp_yn: string
- answer_td_tpp_yn: string
- question_td_tpp_fr: string
- answer_td_tpp_fr: sequence: string
- question_td_o2_yn: string
- answer_td_o2_yn: string
- question_td_o2_fr: string
- answer_td_o2_fr: sequence: string
- question_td_o2_d2_yn: string
- answer_td_o2_d2_yn: string
- question_td_o2_d2_fr: string
- answer_td_o2_d2_fr: sequence: string
- question_td_o2_d3_yn: string
- answer_td_o2_d3_yn: string
- question_td_o2_d3_fr: string
- answer_td_o2_d3_fr: sequence: string
- question_td_o2_layout_yn: string
- answer_td_o2_layout_yn: string
- question_td_o2_layout_fr: string
- answer_td_o2_layout_fr: sequence: string
- question_td_o2_d2_layout_yn: string
- answer_td_o2_d2_layout_yn: string
- question_td_o2_d2_layout_fr: string
- answer_td_o2_d2_layout_fr: sequence: string
- question_td_o2_d3_layout_yn: string
- answer_td_o2_d3_layout_yn: string
- question_td_o2_d3_layout_fr: string
- answer_td_o2_d3_layout_fr: sequence: string
- question_sd_layout_yn: string
- answer_sd_layout_yn: string
- question_sd_tpp_yn: string
- answer_sd_tpp_yn: string
- question_sd_o2_yn: string
- answer_sd_o2_yn: string
- question_sd_o2_fr: string
- answer_sd_o2_fr: sequence: string
- question_sd_o2_d2_yn: string
- answer_sd_o2_d2_yn: string
- question_sd_o2_d2_fr: string
- answer_sd_o2_d2_fr: sequence: string
- question_sd_o2_d3_yn: string
- answer_sd_o2_d3_yn: string
- question_sd_o2_d3_fr: string
- answer_sd_o2_d3_fr: sequence: string
- question_sd_o2_layout_yn: string
- answer_sd_o2_layout_yn: string
- question_sd_o2_layout_fr: string
- answer_sd_o2_layout_fr: sequence: string
- question_sd_o2_d2_layout_yn: string
- answer_sd_o2_d2_layout_yn: string
- question_sd_o2_d2_layout_fr: string
- answer_sd_o2_d2_layout_fr: sequence: string
- question_sd_o2_d3_layout_yn: string
- answer_sd_o2_d3_layout_yn: string
- question_sd_o2_d3_layout_fr: string
- answer_sd_o2_d3_layout_fr: sequence: string
- splits:
- test:
- num_bytes: 66958010.0
- num_examples: 100
- download_size: 65902762
- dataset_size: 66958010.0
- config_name: n5_m4_d144
- features:
- id: int64
- image_top_down: image
- image_north_facing: image
- question_image: string
- options_td: sequence: string
- options_sd: sequence: string
- answer_image_td: string
- answer_image_sd: string
- question_td_layout_yn: string
- answer_td_layout_yn: string
- question_td_layout_fr: string
- answer_td_layout_fr: sequence: string
- question_td_tpp_yn: string
- answer_td_tpp_yn: string
- question_td_tpp_fr: string
- answer_td_tpp_fr: sequence: string
- question_td_o2_yn: string
- answer_td_o2_yn: string
- question_td_o2_fr: string
- answer_td_o2_fr: sequence: string
- question_td_o2_d2_yn: string
- answer_td_o2_d2_yn: string
- question_td_o2_d2_fr: string
- answer_td_o2_d2_fr: sequence: string
- question_td_o2_d3_yn: string
- answer_td_o2_d3_yn: string
- question_td_o2_d3_fr: string
- answer_td_o2_d3_fr: sequence: string
- question_td_o2_layout_yn: string
- answer_td_o2_layout_yn: string
- question_td_o2_layout_fr: string
- answer_td_o2_layout_fr: sequence: string
- question_td_o2_d2_layout_yn: string
- answer_td_o2_d2_layout_yn: string
- question_td_o2_d2_layout_fr: string
- answer_td_o2_d2_layout_fr: sequence: string
- question_td_o2_d3_layout_yn: string
- answer_td_o2_d3_layout_yn: string
- question_td_o2_d3_layout_fr: string
- answer_td_o2_d3_layout_fr: sequence: string
- question_sd_layout_yn: string
- answer_sd_layout_yn: string
- question_sd_tpp_yn: string
- answer_sd_tpp_yn: string
- question_sd_o2_yn: string
- answer_sd_o2_yn: string
- question_sd_o2_fr: string
- answer_sd_o2_fr: sequence: string
- question_sd_o2_d2_yn: string
- answer_sd_o2_d2_yn: string
- question_sd_o2_d2_fr: string
- answer_sd_o2_d2_fr: sequence: string
- question_sd_o2_d3_yn: string
- answer_sd_o2_d3_yn: string
- question_sd_o2_d3_fr: string
- answer_sd_o2_d3_fr: sequence: string
- question_sd_o2_layout_yn: string
- answer_sd_o2_layout_yn: string
- question_sd_o2_layout_fr: string
- answer_sd_o2_layout_fr: sequence: string
- question_sd_o2_d2_layout_yn: string
- answer_sd_o2_d2_layout_yn: string
- question_sd_o2_d2_layout_fr: string
- answer_sd_o2_d2_layout_fr: sequence: string
- question_sd_o2_d3_layout_yn: string
- answer_sd_o2_d3_layout_yn: string
- question_sd_o2_d3_layout_fr: string
- answer_sd_o2_d3_layout_fr: sequence: string
- splits:
- test:
- num_bytes: 67434227.0
- num_examples: 100
- download_size: 65991028
- dataset_size: 67434227.0
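The two configurations listed above each expose a single `test` split of 100 examples. As a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable under the ID `Fangjun/RoomSpace`, one of them could be loaded as follows. The helper name `load_roomspace` is hypothetical, and only the two config names shown on this page are accepted; larger versions mentioned in the overview may live on other branches that this sketch does not cover.

```python
REPO_ID = "Fangjun/RoomSpace"

# Only the configurations listed on this page; other branches may offer more.
CONFIGS = ("n3_m2_d144", "n5_m4_d144")


def load_roomspace(config: str = "n3_m2_d144"):
    """Load the test split of one listed RoomSpace configuration.

    Hypothetical convenience wrapper, not part of the dataset itself.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split="test")


if __name__ == "__main__":
    ds = load_roomspace("n3_m2_d144")
    row = ds[0]
    # Each row carries two images plus the question/answer string fields.
    print(row["id"], row["question_image"])
```

Each download is roughly 66 MB according to the `download_size` values above, so the first call may take a moment.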
Dataset structure
- Images:
- image_top_down: Top-down images of the rooms.
- image_north_facing: North-facing rectangular images, assuming a robot stands at the south door, facing north.
- Questions & Answers:
- Visual QA: (image_top_down + question_image) -- answer_image_td, (image_north_facing + question_image) -- answer_image_sd
- Layout Setting: (question_td_layout_yn -- answer_td_layout_yn), (question_td_layout_fr -- answer_td_layout_fr)
- TPP Setting: (question_td_tpp_yn -- answer_td_tpp_yn), (question_td_tpp_fr -- answer_td_tpp_fr)
- O2 Setting: (question_td_o2_yn -- answer_td_o2_yn), (question_td_o2_fr -- answer_td_o2_fr), (question_sd_o2_yn -- answer_sd_o2_yn), (question_sd_o2_fr -- answer_sd_o2_fr)
- O2+D2 Setting: (question_td_o2_d2_yn -- answer_td_o2_d2_yn), (question_td_o2_d2_fr -- answer_td_o2_d2_fr), (question_sd_o2_d2_yn -- answer_sd_o2_d2_yn), (question_sd_o2_d2_fr -- answer_sd_o2_d2_fr)
- O2+D3 Setting: (question_td_o2_d3_yn -- answer_td_o2_d3_yn), (question_td_o2_d3_fr -- answer_td_o2_d3_fr), (question_sd_o2_d3_yn -- answer_sd_o2_d3_yn), (question_sd_o2_d3_fr -- answer_sd_o2_d3_fr)
- Layout + O2 + D2 Setting: (question_td_o2_d2_layout_yn -- answer_td_o2_d2_layout_yn), (question_td_o2_d2_layout_fr -- answer_td_o2_d2_layout_fr), (question_sd_o2_d2_layout_yn -- answer_sd_o2_d2_layout_yn), (question_sd_o2_d2_layout_fr -- answer_sd_o2_d2_layout_fr)
- Layout + O2 + D3 Setting: (question_td_o2_d3_layout_yn -- answer_td_o2_d3_layout_yn), (question_td_o2_d3_layout_fr -- answer_td_o2_d3_layout_fr), (question_sd_o2_d3_layout_yn -- answer_sd_o2_d3_layout_yn), (question_sd_o2_d3_layout_fr -- answer_sd_o2_d3_layout_fr)
- Question Types:
- fr: Find Relation
- yn: Yes/No
- Points of View:
- td: Top-Down View
- sd: North Facing View (from the south door)
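The text question and answer columns follow a regular naming pattern, `question_<view>_<setting>_<type>` paired with `answer_<view>_<setting>_<type>`. The sketch below builds those names from the abbreviations above. The function names are hypothetical, and the two excluded combinations (`sd` + `layout` + `fr`, `sd` + `tpp` + `fr`) are inferred only from their absence in the feature list on this page.

```python
VIEWS = ("td", "sd")          # top-down, north-facing
QUESTION_TYPES = ("yn", "fr")  # Yes/No, Find Relation
SETTINGS = (
    "layout", "tpp", "o2", "o2_d2", "o2_d3",
    "o2_layout", "o2_d2_layout", "o2_d3_layout",
)

# Combinations that do not appear in the feature list above (assumption:
# the list on this page is complete).
MISSING = {("sd", "layout", "fr"), ("sd", "tpp", "fr")}


def qa_columns(view: str, setting: str, qtype: str) -> tuple[str, str]:
    """Return the (question, answer) column names for one text QA task."""
    if view not in VIEWS:
        raise ValueError(f"unknown view {view!r}")
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting {setting!r}")
    if qtype not in QUESTION_TYPES:
        raise ValueError(f"unknown question type {qtype!r}")
    if (view, setting, qtype) in MISSING:
        raise KeyError(f"no columns listed for {view}/{setting}/{qtype}")
    suffix = f"{view}_{setting}_{qtype}"
    return f"question_{suffix}", f"answer_{suffix}"


def all_qa_columns() -> list[tuple[str, str]]:
    """Enumerate every listed (question, answer) column pair."""
    return [
        qa_columns(v, s, t)
        for v in VIEWS
        for s in SETTINGS
        for t in QUESTION_TYPES
        if (v, s, t) not in MISSING
    ]


if __name__ == "__main__":
    print(qa_columns("td", "o2_d2", "fr"))
    print(len(all_qa_columns()))
```

The visual QA fields (`question_image`, `options_td`, `options_sd`, `answer_image_td`, `answer_image_sd`) do not follow this pattern and are left out of the helper.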
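Collected summary
Dataset introduction

Background and challenges
Background overview
RoomSpace is a benchmark dataset for evaluating the spatial reasoning ability of language models. It combines images (top-down and north-facing views) with textual question-answer pairs to simulate real-world room scenes. The dataset offers multiple viewpoints and multiple question types (such as visual QA and layout settings), is intended to test model performance on qualitative reasoning tasks, and supports versions of different sizes (such as 100, 1K, and 10K samples) for flexible evaluation.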