OpenAGI Tasks
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/agiresearch/OpenAGI
Resource overview:
This dataset is a set of tasks designed to evaluate Artificial General Intelligence (AGI) capabilities, combining standard benchmark tasks with open-ended tasks. It supports several learning paradigms, including zero-shot learning, few-shot learning, fine-tuning, and reinforcement learning, so that the performance of Large Language Models (LLMs) can be evaluated comprehensively. For training and testing, 10% of the tasks are randomly sampled for training and an additional 10% are used for testing. The tasks require multi-step, real-world problem solving in which LLMs work together with domain-specific expert models.
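The 10% / 10% sampling described above can be sketched as follows. This is a minimal illustration, not the dataset authors' procedure: the function name `split_tasks`, the fixed seed, the task identifiers, and the assumption that the two samples are disjoint are all hypothetical and do not come from the source.

```python
import random


def split_tasks(task_ids, train_frac=0.10, test_frac=0.10, seed=0):
    """Draw two random samples of task IDs: one for training, one for testing.

    Assumption (not stated in the source): the two samples do not overlap.
    """
    rng = random.Random(seed)          # local RNG so the split is reproducible
    shuffled = list(task_ids)
    rng.shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)

    train = shuffled[:n_train]                    # first slice: training tasks
    test = shuffled[n_train:n_train + n_test]     # next slice: testing tasks
    return train, test


# Hypothetical task identifiers, used only to exercise the function.
tasks = [f"task_{i}" for i in range(200)]
train_tasks, test_tasks = split_tasks(tasks)
```

Shuffling once and slicing keeps the two samples disjoint by construction; the remaining 80% of tasks are left unused in this sketch.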
Provided by:
OpenAGI Research
Collected summary
Dataset introduction

Background and challenges
Background overview
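OpenAGI Tasks is a Python package (pyopenagi) for creating AI agents. It is intended to help users build and manage AI agents, with a particular focus on combining large language models (LLMs) with domain expert knowledge. It provides functions for adding, uploading, and downloading agents, is based on the research paper 'OpenAGI: When LLM Meets Domain Experts', accepts open-source contributions, and is released under the MIT license.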
The content above was collected and summarized by 遇见数据集.



