OTTER Dataset
OTTER: A Vision-Language-Action Model
Dataset Description
- Authors: Huang Huang, Fangchen Liu, Letian Fu, Tingfan Wu, Mustafa Mukadam, Jitendra Malik, Ken Goldberg, Pieter Abbeel
- Affiliations: UC Berkeley and Meta
- Paper: Otter: A Vision-Language-Action Model with Text-Aware Feature Extraction
- Project page: OTTER Project Page
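Dataset Version
- Initial release date: 2025-03-05
Getting the Dataset
- The dataset is hosted on Hugging Face and supports pre-training on Open X-Embodiment.
- LeRobot version of the dataset and fine-tuning scripts: LeRobot Dataset, Pi0 Fine-tuning Scripts
Dataset Download Commands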
    pip install -U "huggingface_hub[cli]"
    mkdir -p dataset
    pushd dataset
    huggingface-cli download mlfu7/icrt_pour --repo-type dataset --local-dir .
    huggingface-cli download mlfu7/icrt_drawer --repo-type dataset --local-dir .
    huggingface-cli download mlfu7/icrt_poke --repo-type dataset --local-dir .
    huggingface-cli download mlfu7/icrt_pickplace_1 --repo-type dataset --local-dir .
    huggingface-cli download mlfu7/icrt_stack_mul_tfds --repo-type dataset --local-dir .
    huggingface-cli download mlfu7/icrt_pickplace --repo-type dataset --local-dir .
    popd
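The six download commands above differ only in the repository ID, so they can be written as a loop. The following is a minimal sketch, not the authors' script: the `download_all` function and the `DRY_RUN` switch are hypothetical names introduced here for illustration. It assumes that passing `--local-dir dataset` from the parent directory is equivalent to the original `pushd dataset` / `--local-dir .` pattern. By default it only prints the commands; set `DRY_RUN=0` to run them.

```shell
#!/bin/sh
set -eu

# Repository IDs copied from the download commands above.
REPOS="mlfu7/icrt_pour mlfu7/icrt_drawer mlfu7/icrt_poke mlfu7/icrt_pickplace_1 mlfu7/icrt_stack_mul_tfds mlfu7/icrt_pickplace"

# DRY_RUN is a hypothetical switch for this sketch (not a huggingface-cli option).
# 1 (default) prints each command; 0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# download_all is a hypothetical helper: it fetches every repository in REPOS
# into the directory given as its first argument.
download_all() {
  target_dir="$1"
  # Only create the target directory when actually downloading.
  [ "$DRY_RUN" = "1" ] || mkdir -p "$target_dir"
  for repo in $REPOS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "huggingface-cli download $repo --repo-type dataset --local-dir $target_dir"
    else
      huggingface-cli download "$repo" --repo-type dataset --local-dir "$target_dir"
    fi
  done
}

download_all dataset
```

Running the script as-is lists the six commands it would execute, which is a cheap way to check the repository IDs before starting a large download.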
License
- Apache 2.0
Citation
@article{huang2025otter,
  title={Otter: A Vision-Language-Action Model with Text-Aware Feature Extraction},
  author={Huang Huang and Fangchen Liu and Letian Fu and Tingfan Wu and Mustafa Mukadam and Jitendra Malik and Ken Goldberg and Pieter Abbeel},
  journal={arXiv preprint arXiv:2503.03734},
  year={2025}
}




