OpenAGI Tasks
Source: arXiv · Indexed 2025-09-30
Download link:
https://github.com/agiresearch/OpenAGI
Resource description:
This dataset is a collection of tasks for evaluating artificial general intelligence (AGI) capabilities, combining standard benchmark tasks with open-ended tasks. It supports several learning paradigms, including zero-shot learning, few-shot learning, fine-tuning, and reinforcement learning, so that large language models (LLMs) can be evaluated comprehensively. For training and testing, 10% of the tasks are randomly sampled for training and an additional 10% are held out for testing. The tasks require multi-step solutions to real-world problems, combining LLMs with domain-specific expert models.
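The 10% / 10% sampling described above can be sketched as follows. This is a minimal illustration, not the OpenAGI project's own code: the function name `split_tasks`, the integer task IDs, the fixed seed, and the assumption that the two samples are disjoint are all hypothetical choices made for the example.

```python
import random


def split_tasks(task_ids, train_frac=0.10, test_frac=0.10, seed=0):
    """Randomly draw disjoint train and test subsets from a pool of task IDs.

    Assumption: the "additional 10%" used for testing does not overlap
    with the 10% used for training. The source does not state this.
    """
    rng = random.Random(seed)      # local RNG so the split is reproducible
    shuffled = list(task_ids)
    rng.shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)

    train = shuffled[:n_train]                  # first slice: training tasks
    test = shuffled[n_train:n_train + n_test]   # next slice: testing tasks
    return train, test


if __name__ == "__main__":
    # Hypothetical pool of 200 task IDs, used only for illustration.
    train, test = split_tasks(range(200))
    print(len(train), len(test))
```

Shuffling once and slicing keeps the two subsets disjoint by construction; the remaining 80% of tasks are simply left unused in this sketch.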
Provider:
OpenAGI Research



