MAT-THOR

Indexed from arXiv on 2025-09-30

Download link:

https://lamma-p.github.io

Resource overview:

MAT-THOR is a multi-agent, long-horizon task dataset extended from the SMART-LLM benchmark. It is designed to evaluate LaMMA-P and baseline methods in the AI2-THOR simulator. The dataset comprises 70 tasks across five floor plans, grouped into three complexity levels: composite tasks, complex tasks, and ambiguous-instruction tasks. Each task comes with detailed information, including the initial state, the robots' skills, and the final conditions that define success. Tasks can be run with two to four robots with differing skills, and results are evaluated with five metrics: Success Rate (SR), Goal Condition Recall (GCR), Robot Utilization (RU), Executability (Exe), and Efficiency (Eff). The research focus is task allocation and execution efficiency for collaborative heterogeneous robot teams.
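To make the task description concrete, the following is a minimal Python sketch of how one task record and two of the metrics might be represented. It is an illustration only: the class names, field names, floor plan name, and object states are all hypothetical, since this page does not describe the dataset's actual file schema. The metric functions are simplified assumptions (recall as the fraction of final conditions met, executability as the fraction of executable actions) and may differ from the definitions used by the MAT-THOR authors.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Robot:
    """A robot and the skills it can perform (hypothetical schema)."""
    name: str
    skills: frozenset[str]


@dataclass
class Task:
    """One task record: instruction, scene, category, team, and goal (hypothetical schema)."""
    instruction: str
    floor_plan: str
    category: str  # assumed labels: "composite", "complex", or "ambiguous"
    robots: list[Robot]
    initial_state: dict[str, str]
    final_conditions: dict[str, str]


def goal_condition_recall(task: Task, final_state: dict[str, str]) -> float:
    """Fraction of the task's final conditions met by final_state (simplified assumption)."""
    goals = task.final_conditions
    if not goals:
        return 1.0
    met = sum(1 for obj, wanted in goals.items() if final_state.get(obj) == wanted)
    return met / len(goals)


def executability(action_ok: list[bool]) -> float:
    """Fraction of planned actions that could be executed (simplified assumption)."""
    if not action_ok:
        return 0.0
    return sum(action_ok) / len(action_ok)


# Illustrative values only; none of these come from the dataset itself.
example = Task(
    instruction="Slice the tomato and switch off the light",
    floor_plan="FloorPlan1",
    category="composite",
    robots=[
        Robot("robot1", frozenset({"PickupObject", "SliceObject"})),
        Robot("robot2", frozenset({"ToggleObjectOff"})),
    ],
    initial_state={"Tomato": "whole", "LightSwitch": "on"},
    final_conditions={"Tomato": "sliced", "LightSwitch": "off"},
)

# One of the two final conditions is met in this made-up outcome.
recall = goal_condition_recall(example, {"Tomato": "sliced", "LightSwitch": "on"})
print(recall)
```

Keeping the final conditions as a plain mapping from object to required state keeps the recall computation to a single pass; a real evaluation would read the achieved state from the simulator instead of a hand-written dictionary.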

Collected summary

Dataset introduction

Background and challenges

Background overview

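MAT-THOR is a benchmark dataset of complex, long-horizon multi-agent tasks built on the AI2-THOR simulator and used to evaluate the performance of multi-agent planning methods. It contains tasks of varying complexity, namely composite tasks, complex tasks, and ambiguous-instruction tasks, and the LaMMA-P framework is reported to perform well on it.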

The content above was collected and summarized by 遇见数据集.
