CMU Stretch

Name: CMU Stretch
Creator: OpenDataLab
Published: 2026-05-17 06:30:47
License: No description available

OpenDataLab | Updated: 2026-05-17 | Indexed: 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/CMU_Stretch

Description:

Building a robot that can understand and learn to interact by watching humans raises several vision problems. Despite some successful results on static datasets, it remains unclear how current models can be used directly on a robot. In this paper, we aim to bridge this gap by leveraging videos of human interaction in an environment-centric manner. Using internet videos of human behavior, we train a visual affordance model that estimates where and how a human is likely to interact in a scene. The structure of these behavioral affordances directly enables a robot to perform many complex tasks. We show how to integrate our affordance model with four robot learning paradigms: offline imitation learning, exploration, goal-conditioned learning, and action parameterization for reinforcement learning. We demonstrate the effectiveness of our approach, which we call Vision-Robotics Bridge (VRB) because it aims to integrate computer vision techniques with robotic manipulation, across 4 real-world environments, over 10 different tasks, and 2 robot platforms operating in the wild.
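As a rough illustration of how a "where and how" affordance could parameterize a robot action, the sketch below samples a contact pixel from a heatmap and projects a target along an interaction direction. The heatmap-plus-direction representation, the function names, and the `step_px` parameter are assumptions made for this example; they are not taken from the dataset or the VRB code.

```python
import math
import random


def sample_contact_point(heatmap, rng):
    """Sample a (row, col) cell with probability proportional to its heatmap value."""
    cells = [(r, c) for r, row in enumerate(heatmap) for c, _ in enumerate(row)]
    weights = [value for row in heatmap for value in row]
    return rng.choices(cells, weights=weights, k=1)[0]


def action_from_affordance(heatmap, direction, step_px=20.0, seed=0):
    """Turn a hypothetical affordance output into a 2D image-space action.

    heatmap   : list of rows of non-negative scores ("where" to make contact)
    direction : (d_row, d_col) motion after contact ("how" to interact)
    step_px   : assumed length of the post-contact motion, in pixels
    Returns the sampled contact cell and the target point after the motion.
    """
    rng = random.Random(seed)
    row, col = sample_contact_point(heatmap, rng)
    # Normalize the direction so step_px alone controls the motion length.
    norm = math.hypot(direction[0], direction[1])
    d_row, d_col = direction[0] / norm, direction[1] / norm
    target = (row + step_px * d_row, col + step_px * d_col)
    return (row, col), target
```

A downstream policy could treat the contact cell as a grasp or push location and the target as a waypoint, though mapping image coordinates to robot coordinates would require camera calibration that this sketch leaves out.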

Provided by:

OpenDataLab

Created:

2023-10-20

Collected Summary

Dataset Introduction