MLL-Lab/tos-data

Name: MLL-Lab/tos-data
Creator: MLL-Lab
Published: 2026-02-11 03:27:39
License: cc-by-4.0

Source: Hugging Face. Updated 2026-02-11; indexed 2026-04-05.

Download link:

https://hf-mirror.com/datasets/MLL-Lab/tos-data


Resource description:

---
language:
- en
license: cc-by-4.0
size_categories:
- 10K<n<100K
task_categories:
- robotics
- visual-question-answering
- image-to-text
tags:
- spatial-reasoning
- 3d-scenes
- vision-language
- benchmark
---

# Theory of Space: Visual Scene Dataset

This dataset provides pre-rendered 3D multi-room environments for evaluating spatial reasoning in Vision Language Models (VLMs). It is designed to support the **Theory of Space (ToS)** benchmark, which tests whether foundation models can actively construct spatial beliefs through exploration.

**Paper**: [Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?](https://huggingface.co/papers/2602.07055)

**Project Page**: [https://theory-of-space.github.io](https://theory-of-space.github.io)

**GitHub Repository**: [https://github.com/mll-lab-nu/Theory-of-Space](https://github.com/mll-lab-nu/Theory-of-Space)

## Dataset Overview

| Property | Value |
|----------|-------|
| Rooms | 3 |
| Total Runs | 100 |
| Objects per Room | 4 |
| Includes False-Belief Data | Yes |

## Usage

### Download

Download via Hugging Face CLI:

```bash
# Add huggingface token (optional, avoid 429 rate limit)
# export HF_TOKEN=
hf download MLL-Lab/tos-data --repo-type dataset --local-dir room_data
```

Or use the ToS setup script which downloads automatically:

```bash
git clone --single-branch --branch release https://github.com/mll-lab-nu/Theory-of-Space.git
cd Theory-of-Space
source setup.sh
```

### Sample Usage

To run a full pipeline evaluation (explore + eval + cogmap) using the provided scripts:

```bash
python scripts/SpatialGym/spatial_run.py \
  --phase all \
  --model-name gpt-5.2 \
  --num 25 \
  --data-dir room_data/3-room/ \
  --output-root result/ \
  --render-mode vision,text \
  --exp-type active,passive \
  --inference-mode batch
```

## File Structure

```
tos-data/
└── runXX/                          # 100 runs (run00 - run99)
    ├── meta_data.json              # Scene metadata (layout, objects, positions)
    ├── falsebelief_exp.json        # False-belief experiment data
    ├── top_down.png                # Top-down view of the scene
    ├── top_down_annotated.png      # Annotated top-down view
    ├── top_down_fbexp.png          # Top-down view (false-belief state)
    ├── agent_facing_*.png          # Agent perspective images (north/south/east/west)
    ├── <object_id>_facing_*.png    # Object/door camera views
    ├── *_fbexp.png                 # False-belief experiment images
    └── top_down/
        └── img_0000.png            # Additional top-down renders
```

## File Descriptions

| File | Description |
|------|-------------|
| `meta_data.json` | Complete scene metadata including room layout, object positions, orientations, and connectivity |
| `falsebelief_exp.json` | Specifies object modifications (move/rotate) for belief update evaluation |
| `agent_facing_*.png` | Egocentric views from agent's perspective in 4 cardinal directions |
| `<object_id>_facing_*.png` | Views from each object/door position |
| `*_fbexp.png` | Images rendered after false-belief modifications |
| `top_down*.png` | Bird's-eye view for visualization and debugging |

## Citation

```bibtex
@inproceedings{zhang2026theoryofspace,
  title     = {Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?},
  author    = {Zhang, Pingyue and Huang, Zihan and Wang, Yue and Zhang, Jieyu and Xue, Letian and Wang, Zihan and Wang, Qineng and Chandrasegaran, Keshigeyan and Zhang, Ruohan and Choi, Yejin and Krishna, Ranjay and Wu, Jiajun and Li, Fei-Fei and Li, Manling},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2026},
}
```
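
As a rough illustration of the file structure described above, the following Python sketch walks a local copy of the dataset and reports what each run directory contains. It is not part of the ToS codebase: the helper names `find_runs` and `summarize_run` are hypothetical, the `room_data` root is only the `--local-dir` value from the download command, and the internal schema of `meta_data.json` is not documented here, so the sketch lists only its top-level keys instead of assuming any field names.

```python
import json
from pathlib import Path

# File names taken from the "File Structure" listing in the dataset card.
EXPECTED_FILES = ("meta_data.json", "falsebelief_exp.json", "top_down.png")


def find_runs(data_root):
    """Return sorted run directories, i.e. folders that hold a meta_data.json.

    Searching recursively is an assumption: it avoids hard-coding whether runs
    sit directly under the root or under a subfolder such as 3-room/.
    """
    root = Path(data_root)
    if not root.is_dir():
        return []
    return sorted(p.parent for p in root.rglob("meta_data.json"))


def summarize_run(run_dir):
    """Describe one run from what is on disk, without assuming the JSON schema."""
    run_dir = Path(run_dir)
    with open(run_dir / "meta_data.json", encoding="utf-8") as f:
        meta = json.load(f)
    fbexp = sorted(p.name for p in run_dir.glob("*_fbexp.png"))
    # Keep only the pre-modification agent views (skip any *_fbexp.png variants).
    agent = sorted(
        p.name
        for p in run_dir.glob("agent_facing_*.png")
        if not p.name.endswith("_fbexp.png")
    )
    return {
        "run": run_dir.name,
        "missing": [n for n in EXPECTED_FILES if not (run_dir / n).exists()],
        "agent_views": agent,
        "fbexp_images": fbexp,
        "meta_keys": sorted(meta) if isinstance(meta, dict) else [],
    }


if __name__ == "__main__":
    # "room_data" mirrors the --local-dir used in the download command.
    for run in find_runs("room_data"):
        print(summarize_run(run))
```

A check like this can be useful before launching an evaluation, since a non-empty `missing` list points to an incomplete download rather than a problem in the evaluation scripts.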

Provider:

MLL-Lab
