
allenai/molmobot-data

Hugging Face · Updated 2026-03-26 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/allenai/molmobot-data
Description:
---
license: odc-by
configs:
- config_name: DoorOpeningDataGenConfig
  data_files:
  - split: val_pkgs
    path: DoorOpeningDataGenConfig/val_pkgs-*
  - split: train_pkgs
    path: DoorOpeningDataGenConfig/train_pkgs-*
- config_name: FrankaPickAndPlaceColorOmniCamConfig
  data_files:
  - split: val_pkgs
    path: FrankaPickAndPlaceColorOmniCamConfig/val_pkgs-*
  - split: train_pkgs
    path: FrankaPickAndPlaceColorOmniCamConfig/train_pkgs-*
- config_name: FrankaPickAndPlaceNextToOmniCamConfig
  data_files:
  - split: val_pkgs
    path: FrankaPickAndPlaceNextToOmniCamConfig/val_pkgs-*
  - split: train_pkgs
    path: FrankaPickAndPlaceNextToOmniCamConfig/train_pkgs-*
- config_name: FrankaPickAndPlaceOmniCamConfig
  data_files:
  - split: val_pkgs
    path: FrankaPickAndPlaceOmniCamConfig/val_pkgs-*
  - split: train_pkgs
    path: FrankaPickAndPlaceOmniCamConfig/train_pkgs-*
- config_name: FrankaPickAndPlaceOmniCamConfig_ObjectBackfill
  data_files:
  - split: train_pkgs
    path: FrankaPickAndPlaceOmniCamConfig_ObjectBackfill/train_pkgs-*
- config_name: FrankaPickOmniCamConfig
  data_files:
  - split: val_pkgs
    path: FrankaPickOmniCamConfig/val_pkgs-*
  - split: train_pkgs
    path: FrankaPickOmniCamConfig/train_pkgs-*
- config_name: RBY1OpenDataGenConfig
  data_files:
  - split: val_pkgs
    path: RBY1OpenDataGenConfig/val_pkgs-*
  - split: train_pkgs
    path: RBY1OpenDataGenConfig/train_pkgs-*
- config_name: RBY1PickAndPlaceDataGenConfig
  data_files:
  - split: val_pkgs
    path: RBY1PickAndPlaceDataGenConfig/val_pkgs-*
  - split: train_pkgs
    path: RBY1PickAndPlaceDataGenConfig/train_pkgs-*
- config_name: RBY1PickDataGenConfig
  data_files:
  - split: val_pkgs
    path: RBY1PickDataGenConfig/val_pkgs-*
  - split: train_pkgs
    path: RBY1PickDataGenConfig/train_pkgs-*
dataset_info:
- config_name: DoorOpeningDataGenConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 9061
    num_examples: 104
  - name: train_pkgs
    num_bytes: 1494065
    num_examples: 16998
  download_size: 1124564
  dataset_size: 3386850
- config_name: FrankaPickAndPlaceColorOmniCamConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 11402
    num_examples: 115
  - name: train_pkgs
    num_bytes: 533560
    num_examples: 5341
  download_size: 431185
  dataset_size: 1078522
- config_name: FrankaPickAndPlaceNextToOmniCamConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 9022
    num_examples: 90
  - name: train_pkgs
    num_bytes: 5859272
    num_examples: 58074
  download_size: 3158508
  dataset_size: 11727566
- config_name: FrankaPickAndPlaceOmniCamConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 22500
    num_examples: 239
  - name: train_pkgs
    num_bytes: 8781663
    num_examples: 92543
  download_size: 4218989
  dataset_size: 17585826
- config_name: FrankaPickAndPlaceOmniCamConfig_ObjectBackfill
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: train_pkgs
    num_bytes: 2882553
    num_examples: 26231
  download_size: 660231
  dataset_size: 2882553
- config_name: FrankaPickOmniCamConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 16688
    num_examples: 194
  - name: train_pkgs
    num_bytes: 6368227
    num_examples: 73291
  download_size: 3294808
  dataset_size: 12753142
- config_name: RBY1OpenDataGenConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 1817
    num_examples: 22
  - name: train_pkgs
    num_bytes: 893203
    num_examples: 10522
  download_size: 620623
  dataset_size: 1788223
- config_name: RBY1PickAndPlaceDataGenConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 2480
    num_examples: 27
  - name: train_pkgs
    num_bytes: 920114
    num_examples: 9905
  download_size: 580488
  dataset_size: 1843823
- config_name: RBY1PickDataGenConfig
  features:
  - name: path
    dtype: string
  - name: shard_id
    dtype: int64
  - name: offset
    dtype: int64
  - name: size
    dtype: int64
  - name: inflated_size
    dtype: int64
  - name: part
    dtype: int64
  splits:
  - name: val_pkgs
    num_bytes: 4283
    num_examples: 51
  - name: train_pkgs
    num_bytes: 2610373
    num_examples: 30750
  download_size: 1767900
  dataset_size: 5237590
tags:
- robotics
- embodied ai
- manipulation
- mobile manipulation
- pick-and-place
- proprioception
- multi-view video
language:
- en
size_categories:
- 1M<n<10M
---

# MolmoBot-data

Training episode data (actions, visual inputs, and other sensor data) for 8 tasks on 2 robotic platforms:

- DoorOpeningDataGenConfig
- RBY1OpenDataGenConfig
- RBY1PickDataGenConfig
- FrankaPickOmniCamConfig
- RBY1PickAndPlaceDataGenConfig
- FrankaPickAndPlaceOmniCamConfig
- FrankaPickAndPlaceColorOmniCamConfig
- FrankaPickAndPlaceNextToOmniCamConfig

Please note that each package indexed by the parquet files can contain several instances of episode data.

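To make the index layout more concrete, here is a minimal sketch of how the parquet index rows for one configuration might be loaded and summarized per `part`. The field names (`path`, `shard_id`, `offset`, `size`, `inflated_size`, `part`) come from the dataset metadata; reading `size` and `inflated_size` as compressed and decompressed byte counts is an assumption, and `load_index` and `summarize_index` are hypothetical helper names, not part of the repository's scripts.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping

REPO_ID = "allenai/molmobot-data"


def load_index(config_name: str, split: str = "val_pkgs"):
    """Load the package index for one configuration and split.

    Needs the `datasets` package and network access, so the import is
    kept local to this function.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)


def summarize_index(rows: Iterable[Mapping[str, object]]) -> Dict[int, Dict[str, int]]:
    """Count packages and sum the size fields for each `part` value.

    Assumption: `size` and `inflated_size` are byte counts before and
    after decompression; the dataset card does not state this explicitly.
    """
    summary: Dict[int, Dict[str, int]] = defaultdict(
        lambda: {"packages": 0, "size": 0, "inflated_size": 0}
    )
    for row in rows:
        entry = summary[int(row["part"])]
        entry["packages"] += 1
        entry["size"] += int(row["size"])
        entry["inflated_size"] += int(row["inflated_size"])
    return dict(summary)
```

For example, `summarize_index(load_index("DoorOpeningDataGenConfig", "val_pkgs"))` would return one entry per part. Since each package can hold several episodes, the package counts are a lower bound on the number of episodes rather than an episode count.
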
We also provide an extension dataset for `FrankaPickAndPlace` (not used to train any of our [shared models](https://huggingface.co/collections/allenai/molmobot-models)) with additional object types:

- FrankaPickAndPlaceOmniCamConfig_ObjectBackfill

## Data access

We recommend using [bulk_download.py](bulk_download.py). For example,

```bash
python bulk_download.py --split all --max_part_shards 1 --all /path/to/mbdata/
```

will download and extract one shard from each part of each configuration in each available split under `/path/to/mbdata`. To launch it, you might need to install some dependencies, e.g., with:

```bash
pip install zstandard datasets huggingface_hub tqdm
```

A standalone example of how to access the episode data via streaming is also included in [stream_access_example.py](stream_access_example.py).

### Dataset post-processing and stats

After bulk downloading the data, please use

- [repair_video_paths.py](repair_video_paths.py) to fix video paths,
- [validate_trajectories.py](validate_trajectories.py) to extract valid trajectories from each dataset and part, and
- [calculate_stats.py](calculate_stats.py) to extract dataset stats.

For example:

```bash
export SPLIT_DIR=/path/to/mbdata/DoorOpeningDataGenConfig/part0/train
python scripts/data/repair_video_paths.py "$SPLIT_DIR"
python scripts/data/validate_trajectories.py "$SPLIT_DIR" \
  --check-visibility droid_shoulder_light_randomization pickup_obj \
  --check-visibility droid_shoulder_light_randomization place_receptacle
python scripts/data/calculate_stats.py "$SPLIT_DIR" --keys actions obs/agent/qpos
```

## License

This dataset is licensed under [ODC-BY 1.0](https://opendatacommons.org/licenses/by/1-0/). It is intended for research and educational use in accordance with Ai2's [Responsible Use Guidelines](https://allenai.org/responsible-use).

The subset of Objaverse used to generate these data is licensed under [ODC-BY 1.0](https://opendatacommons.org/licenses/by/1-0/), with license information accessible, e.g., through the [license_info.py](license_info.py) script. The scripts are licensed under [Apache 2.0](CODE_LICENSE).

## BibTeX

```
@misc{deshpande2026molmobot,
      title={MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation},
      author={Abhay Deshpande and Maya Guru and Rose Hendrix and Snehal Jauhri and Ainaz Eftekhar and Rohun Tripathi and Max Argus and Jordi Salvador and Haoquan Fang and Matthew Wallingford and Wilbert Pumacay and Yejin Kim and Quinn Pfeifer and Ying-Chun Lee and Piper Wolters and Omar Rayyan and Mingtong Zhang and Jiafei Duan and Karen Farley and Winson Han and Eli Vanderbilt and Dieter Fox and Ali Farhadi and Georgia Chalvatzaki and Dhruv Shah and Ranjay Krishna},
      year={2026},
      eprint={2603.16861},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.16861},
}
```
Provider:
allenai