meta-agents-research-environments/gaia2
Source: Hugging Face. Updated: 2025-09-25. Indexed: 2025-10-18.
Download link:
https://hf-mirror.com/datasets/meta-agents-research-environments/gaia2
Resource overview:
Gaia2 is a benchmark dataset for evaluating the capabilities of AI agents in simulated environments. It contains 800 scenarios that test agent performance in dynamic, time-aware, and multi-agent collaborative settings. The README describes the dataset's features, including configurations that each focus on a different capability, such as execution, search, adaptability, time, and ambiguity. It also covers the dataset's structure, how to use it, and the evaluation process. The dataset is maintained by the Meta AI Research team and is available under the CC BY 4.0 license.
Provided by:
meta-agents-research-environments



