Data from: A careful examination of large behavior models for multitask dexterous manipulation

Name: Data from: A careful examination of large behavior models for multitask dexterous manipulation
Creator: Dryad
Published: 2026-04-20 11:58:59
License: No description available

DataCite Commons | Updated 2026-04-20 | Indexed 2026-04-25

Download link:

https://datadryad.org/dataset/doi:10.5061/dryad.xd2547dxc

Description:

Robot manipulation has seen tremendous progress in recent years, with imitation learning policies enabling successful performance of dexterous and hard-to-model tasks. Concurrently, scaling data and model size has led to the development of capable language and vision foundation models, motivating large-scale efforts to create general-purpose robot foundation models. While these models have garnered significant enthusiasm and investment, meaningful evaluation of real-world performance remains a challenge, limiting both the pace of development and inhibiting a nuanced understanding of current capabilities. In this paper, we rigorously evaluate multitask robot manipulation policies, referred to as Large Behavior Models (LBMs), by extending the Diffusion Policy paradigm across a corpus of simulated and real-world robot data. We propose and validate an evaluation pipeline to rigorously analyze the capabilities of these models with statistical confidence. We compare against single-task baselines through blind, randomized trials in a controlled setting, using both simulation and real-world experiments. We find that multitask pretraining makes the policies more successful and robust, and enables teaching.

Provider: Dryad

Created: 2026-04-07
