ALFRED

Name: ALFRED
Creator: University of Washington and the Allen Institute for Artificial Intelligence
Published: 2020-03-31 00:00:00
License: No description available

askforalfred.com | Updated: 2020-03-31 | Indexed: 2025-02-19

Download link:

https://askforalfred.com/

Resource overview:

The ALFRED dataset was developed jointly by the University of Washington and the Allen Institute for Artificial Intelligence. It is a benchmark for mapping natural language instructions and visual input to action sequences for household tasks. The dataset contains 25,743 natural language instructions covering 8,055 expert demonstrations, with an average of 50 action steps per task, for a total of 428,322 image-action pairs. Instructions come in two forms: high-level goals (e.g., "Place the cup into the coffee maker") and low-level steps (e.g., "Walk towards the coffee maker on the right"). The dataset is built on the AI2-THOR 2.0 simulator, and its tasks involve partial observability, long action sequences, irreversible actions, and the generation of visual interaction masks. Expert demonstrations were generated by a planner, and the natural language instructions were collected through a crowdsourcing platform. Application areas include vision-language navigation, household robot task planning, and language-driven robot behavior learning, with the aim of narrowing the gap between existing benchmarks and real-world applications.
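As a rough illustration of the figures above, the following Python sketch pairs a high-level goal with a low-level step and derives per-demonstration ratios from the quoted totals. The `Annotation` class and the `per_demonstration` helper are hypothetical names introduced here for illustration only; they are not the dataset's actual file schema or API.

```python
from dataclasses import dataclass, field

# Totals quoted in the description above.
NUM_INSTRUCTIONS = 25_743
NUM_DEMONSTRATIONS = 8_055
NUM_IMAGE_ACTION_PAIRS = 428_322


@dataclass
class Annotation:
    """Hypothetical container for one instruction; not the official ALFRED schema."""
    high_level_goal: str
    low_level_steps: list[str] = field(default_factory=list)


def per_demonstration(total: int, demonstrations: int = NUM_DEMONSTRATIONS) -> float:
    """Average count per expert demonstration."""
    return total / demonstrations


# The two example instructions given in the description.
example = Annotation(
    high_level_goal="Place the cup into the coffee maker",
    low_level_steps=["Walk towards the coffee maker on the right"],
)

instructions_per_demo = per_demonstration(NUM_INSTRUCTIONS)
pairs_per_demo = per_demonstration(NUM_IMAGE_ACTION_PAIRS)

print(f"{instructions_per_demo:.2f} instructions per demonstration")   # 3.20
print(f"{pairs_per_demo:.2f} image-action pairs per demonstration")    # 53.17
```

By this arithmetic, each demonstration carries roughly three instructions and about 53 image-action pairs.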

Provider:

University of Washington and the Allen Institute for Artificial Intelligence

Created:

2020-03-31

Collected summary

Dataset introduction