
MLL-Lab/tos-data

Source: Hugging Face | Updated: 2026-02-11 | Indexed: 2026-04-05
Download link:
https://hf-mirror.com/datasets/MLL-Lab/tos-data
Description:
---
language:
- en
license: cc-by-4.0
size_categories:
- 10K<n<100K
task_categories:
- robotics
- visual-question-answering
- image-to-text
tags:
- spatial-reasoning
- 3d-scenes
- vision-language
- benchmark
---

# Theory of Space: Visual Scene Dataset

This dataset provides pre-rendered 3D multi-room environments for evaluating spatial reasoning in Vision Language Models (VLMs). It is designed to support the **Theory of Space (ToS)** benchmark, which tests whether foundation models can actively construct spatial beliefs through exploration.

**Paper**: [Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?](https://huggingface.co/papers/2602.07055)

**Project Page**: [https://theory-of-space.github.io](https://theory-of-space.github.io)

**GitHub Repository**: [https://github.com/mll-lab-nu/Theory-of-Space](https://github.com/mll-lab-nu/Theory-of-Space)

## Dataset Overview

| Property | Value |
|----------|-------|
| Rooms | 3 |
| Total Runs | 100 |
| Objects per Room | 4 |
| Includes False-Belief Data | Yes |

## Usage

### Download

Download via Hugging Face CLI:

```bash
# Add huggingface token (optional, avoid 429 rate limit)
# export HF_TOKEN=
hf download MLL-Lab/tos-data --repo-type dataset --local-dir room_data
```

Or use the ToS setup script which downloads automatically:

```bash
git clone --single-branch --branch release https://github.com/mll-lab-nu/Theory-of-Space.git
cd Theory-of-Space
source setup.sh
```

### Sample Usage

To run a full pipeline evaluation (explore + eval + cogmap) using the provided scripts:

```bash
python scripts/SpatialGym/spatial_run.py \
  --phase all \
  --model-name gpt-5.2 \
  --num 25 \
  --data-dir room_data/3-room/ \
  --output-root result/ \
  --render-mode vision,text \
  --exp-type active,passive \
  --inference-mode batch
```

## File Structure

```
tos-data/
└── runXX/                          # 100 runs (run00 - run99)
    ├── meta_data.json              # Scene metadata (layout, objects, positions)
    ├── falsebelief_exp.json        # False-belief experiment data
    ├── top_down.png                # Top-down view of the scene
    ├── top_down_annotated.png      # Annotated top-down view
    ├── top_down_fbexp.png          # Top-down view (false-belief state)
    ├── agent_facing_*.png          # Agent perspective images (north/south/east/west)
    ├── <object_id>_facing_*.png    # Object/door camera views
    ├── *_fbexp.png                 # False-belief experiment images
    └── top_down/
        └── img_0000.png            # Additional top-down renders
```

## File Descriptions

| File | Description |
|------|-------------|
| `meta_data.json` | Complete scene metadata including room layout, object positions, orientations, and connectivity |
| `falsebelief_exp.json` | Specifies object modifications (move/rotate) for belief update evaluation |
| `agent_facing_*.png` | Egocentric views from agent's perspective in 4 cardinal directions |
| `<object_id>_facing_*.png` | Views from each object/door position |
| `*_fbexp.png` | Images rendered after false-belief modifications |
| `top_down*.png` | Bird's-eye view for visualization and debugging |

## Citation

```bibtex
@inproceedings{zhang2026theoryofspace,
  title     = {Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?},
  author    = {Zhang, Pingyue and Huang, Zihan and Wang, Yue and Zhang, Jieyu and Xue, Letian and Wang, Zihan and Wang, Qineng and Chandrasegaran, Keshigeyan and Zhang, Ruohan and Choi, Yejin and Krishna, Ranjay and Wu, Jiajun and Li, Fei-Fei and Li, Manling},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2026},
}
```
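## Example: Inspecting a Downloaded Run

The layout listed under File Structure can be walked with a few lines of Python. The following is a minimal sketch, not part of the ToS codebase: the helper names `list_runs`, `classify_image`, and `load_run` are hypothetical, the path `room_data/3-room` is taken from the `--data-dir` value in the sample command and assumes the `runXX` folders sit directly beneath it, and nothing is assumed about the internal schema of `meta_data.json` beyond it being valid JSON.

```python
import json
import re
from pathlib import Path

# Run folders are named run00 .. run99 according to the file structure above.
RUN_DIR_PATTERN = re.compile(r"^run\d+$")


def list_runs(data_dir):
    """Return the runXX directories found directly under data_dir, sorted."""
    root = Path(data_dir)
    return sorted(
        p for p in root.iterdir() if p.is_dir() and RUN_DIR_PATTERN.match(p.name)
    )


def classify_image(name):
    """Map a PNG file name to a coarse category based on the documented patterns."""
    stem = Path(name).stem
    if stem.endswith("_fbexp"):
        return "false_belief"   # *_fbexp.png (checked first, includes top_down_fbexp)
    if stem.startswith("top_down"):
        return "top_down"       # top_down*.png
    if stem.startswith("agent_facing_"):
        return "agent_view"     # agent_facing_*.png
    if "_facing_" in stem:
        return "object_view"    # <object_id>_facing_*.png
    return "other"


def load_run(run_dir):
    """Load meta_data.json and group the top-level PNG files of one run."""
    run_dir = Path(run_dir)
    with (run_dir / "meta_data.json").open(encoding="utf-8") as f:
        meta = json.load(f)  # schema is not assumed here; returned as parsed
    images = {}
    for png in sorted(run_dir.glob("*.png")):
        images.setdefault(classify_image(png.name), []).append(png.name)
    return meta, images


if __name__ == "__main__":
    # Assumed location; adjust to wherever the dataset was downloaded.
    for run in list_runs("room_data/3-room"):
        meta, images = load_run(run)
        print(run.name, {kind: len(names) for kind, names in images.items()})
```

The false-belief check comes first so that `top_down_fbexp.png` is grouped with the other post-modification images rather than with the plain top-down renders. Images inside the nested `top_down/` folder are not collected by this sketch.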
Provider:
MLL-Lab