SIMA Evaluation Dataset

Name: SIMA Evaluation Dataset
Creator: SIMA Research Team
License: No description available

Source: arXiv, indexed 2025-09-30

Download link:

https://deepmind.google/discover/blog/sima-generalist-ai-agent-for-3d-virtual-environments/

Description:

This dataset evaluates SIMA agents across a range of simulated 3D environments, including research environments and commercial video games. It includes a natural-language-based framework for clustering evaluation tasks, covering distinct skill categories and per-environment performance assessments. The tasks span multiple environments and were evaluated over multiple training runs, with the goal of benchmarking agent performance across environments and skill categories.

Provider:

SIMA Research Team
