MolmoAct-Pretraining-Mixture
Source: ModelScope community (updated 2026-01-06, indexed 2025-08-16)
Download link: https://modelscope.cn/datasets/allenai/MolmoAct-Pretraining-Mixture
# MolmoAct - Pretraining Mixture
The data mixture used for MolmoAct pretraining. It contains a subset of Open X-Embodiment (OXE) formulated as Action Reasoning Data, along with auxiliary robot data and links to multimodal web data.
MolmoAct is a fully open-source action reasoning model for robotic manipulation developed by the Allen Institute for AI. MolmoAct is trained on a subset of OXE and the MolmoAct Dataset, a dataset of 10k high-quality trajectories of a single-arm Franka robot performing 93 unique manipulation tasks in both home and tabletop environments. It achieves state-of-the-art performance among vision-language-action models on multiple benchmarks while remaining fully open-source. You can find all models in the MolmoAct family [here](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7).
**Learn more about MolmoAct** in our announcement [blog post](https://allenai.org/blog/molmoact) or the [paper](https://huggingface.co/allenai/MolmoAct-7B-D-0812/blob/main/MolmoAct_Technical_Report.pdf).
## Dataset Description
**MolmoAct - Pretraining Mixture** contains third-party content from Open X-Embodiment. Data from the other sources, Pixmo and the Molmo Academic Dataset, is referenced in this dataset card by link only: it is not included in the pretraining mixture, so the only data in this dataset comes from Open X-Embodiment. We convert the raw robot data to Action Reasoning Data using Depth-Anything v2 and Molmo 7B.
The LVIS Bounding Box Dataset can be downloaded from [here](https://huggingface.co/datasets/wentao-yuan/robopoint-data).
The Pixmo and Molmo Academic datasets can be downloaded from [here](https://github.com/allenai/molmo).
## Dataset Statistics
- bc_z: 10289224 samples
- fractal20220817_data: 7065568 samples
- bridge_dataset: 3746792 samples
- auxiliary_depth: 1500000 samples
- auxiliary_line: 1500000 samples
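To put the counts above in proportion, the following minimal sketch sums them and prints each subset's share of the total. The names `SAMPLE_COUNTS` and `mixture_shares` are hypothetical helpers introduced here for illustration. The shares are simply ratios of the listed sample counts; whether training sampled the subsets in these proportions is an assumption this card does not confirm.

```python
# Sample counts copied from the statistics list above.
SAMPLE_COUNTS = {
    "bc_z": 10_289_224,
    "fractal20220817_data": 7_065_568,
    "bridge_dataset": 3_746_792,
    "auxiliary_depth": 1_500_000,
    "auxiliary_line": 1_500_000,
}


def mixture_shares(counts):
    """Return each subset's fraction of the total sample count.

    These are raw ratios of the listed counts, not confirmed
    training-time sampling weights.
    """
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    total = sum(SAMPLE_COUNTS.values())
    print(f"total: {total:,} samples")
    for name, share in mixture_shares(SAMPLE_COUNTS).items():
        print(f"{name:<22} {share:6.2%}")
```

The three OXE-derived subsets (bc_z, fractal20220817_data, bridge_dataset) account for most of the listed samples, with the two auxiliary subsets contributing 1.5 million samples each.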
Quick links:
- 📂 [All Models](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7)
- 📂 [All Data](https://huggingface.co/collections/allenai/molmoact-data-mixture-6897e583e13b6c2cf3ea2b80)
- 📃 [Paper](https://arxiv.org/abs/2508.07917)
- 🎥 [Blog Post](https://allenai.org/blog/molmoact)
- 🎥 [Video](https://youtu.be/-_wag1X25OE?si=Xi_kUaJTmcQBx1f6)
- [Code](https://github.com/allenai/MolmoAct/tree/main)
## License and Use
This dataset is licensed under CC BY-4.0. It is intended for research and educational use in accordance with [Ai2's Responsible Use Guidelines](https://allenai.org/responsible-use). The data is based on the RT-1 Robot Action, Berkeley Bridge, and BC-Z datasets from [Open X-Embodiment](https://github.com/google-deepmind/open_x_embodiment?tab=readme-ov-file). All other datasets linked in the documentation are subject to the respective licenses governing their use.
## Citation
```
@misc{molmoact2025,
title={MolmoAct: Action Reasoning Models that can Reason in Space},
author={Jason Lee and Jiafei Duan and Haoquan Fang and Yuquan Deng and Shuo Liu and Boyang Li and Bohan Fang and Jieyu Zhang and Yi Ru Wang and Sangho Lee and Winson Han and Wilbert Pumacay and Angelica Wu and Rose Hendrix and Karen Farley and Eli VanderBilt and Ali Farhadi and Dieter Fox and Ranjay Krishna},
year={2025},
eprint={2508.07917},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2508.07917}
}
```
Provider: maas
Created: 2025-08-13