meta-agents-research-environments/gaia2_filesystem
Source: Hugging Face. Updated: 2025-09-10. Indexed: 2025-10-18.
Download link:
https://hf-mirror.com/datasets/meta-agents-research-environments/gaia2_filesystem
Description:
GAIA2 is a benchmark filesystem dataset for evaluating general-purpose AI agents, published by the Meta AI research team. The dataset contains simulated scenarios in which all user data, contacts, messages, and interactions are fictional, accompanied by professional annotations. It aims to fill gaps in existing benchmarks around dynamic, time-aware, and multi-agent collaborative scenarios. It is suited to research on AI agent capabilities, benchmarking agent performance across multiple dimensions, academic research on multi-agent systems, development and evaluation of AI assistants, and comparative studies of agent architectures.
Provider:
meta-agents-research-environments



