CMU Franka Exploration

Name: CMU Franka Exploration
Creator: OpenDataLab
Published: 2026-05-24 06:30:48
License: No description available

OpenDataLab, updated 2026-05-24, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/CMU_Franka_Exploration

Description:

In this paper, we address the problem of learning complex, general behaviors directly in the real world. We propose an approach that lets robots learn manipulation skills efficiently from only a small number of real-world interaction trajectories collected across many different settings. Motivated by the success of learning from large-scale datasets in computer vision and natural language processing, we posit that robots must be able to leverage internet-scale human video data in order to learn effectively. Humans interact with the world in many interesting ways, and these interactions can teach robots not only about useful behaviors and affordances, but also about how those behaviors change the world during manipulation. Our approach constructs a structured, human-centric action space grounded in visual affordances learned from human videos. We further train a world model on human videos and fine-tune it on a small amount of robot interaction data without any task supervision. We demonstrate that this affordance-space world model approach enables different robots to learn a variety of manipulation skills in complex environments within 30 minutes of interaction.

Provider:

OpenDataLab

Created:

2023-10-20

Summary

Dataset Introduction