Confucius

Source: GitHub. Updated 2025-02-09. Indexed 2025-02-10.

Download link:

https://github.com/mangopy/Confucius-tool-learning

Resource overview:

Confucius is a dataset for training large language models (LLMs) to use complex tools. Its examples are constructed with a multi-stage learning approach and with iterative self-instruction from introspective feedback.

Created:

2025-02-03

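Original information summary

Confucius-tool-learning dataset overview

Dataset introduction

Confucius is a project for training large language models (LLMs) to use external tools. It applies iterative tool learning from introspective feedback, following an easy-to-difficult curriculum, to improve an LLM's ability to use complex tools in real-world scenarios.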

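Dataset structure

The dataset contains the following fields:

api: the APIs used to solve the specific task.
number: the number of API calls.
prompt: the prompt used to generate the example.
task: the task name.
question: the specific query based on the APIs.
_answer: the solution in chain-of-thought (CoT) format, in which the APIs listed above are called.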

Example (JSON):

{
  "api": [
    [
      "CAL",
      "expression: 2500/5",
      "CAL(expression: e)->float: calculate the result of expression e, e.g. 1+2, 1/3, 4*5 and 7-1."
    ],
    [
      "CAL",
      "expression: 2*%s1",
      "CAL(expression: e)->float: calculate the result of expression e, e.g. 1+2, 1/3, 4*5 and 7-1."
    ],
    [
      "CAL",
      "expression: %s2-200",
      "CAL(expression: e)->float: calculate the result of expression e, e.g. 1+2, 1/3, 4*5 and 7-1."
    ]
  ],
  "number": 3,
  "prompt": "According to the ratio, for every 5 parts that Johnson gets, Mike gets 2 parts.Since Johnson got $2500, each part is therefore $2500/5 = $<<2500/5=500>>500.Mike will get 2*$500 = $<<2*500=1000>>1000.After buying the shirt he will have $1000-$200 = $<<1000-200=800>>800 left. ### 800",
  "question": "The profit from a business transaction is shared among 2 business partners, Mike and Johnson in the ratio 2:5 respectively. If Johnson got $2500, how much will Mike have after spending some of his share on a shirt that costs $200?",
  "_answer": "According to the ratio, for every 5 parts that Johnson gets, Mike gets 2 parts. Since Johnson got $2500, each part is therefore [CAL(2500/5) -> %s1].Mike will get 2*$%s1 = [CAL(2*%s1) -> %s2]. After buying the shirt, he will have $%s2-$200 = [CAL(%s2-200) -> %s3] left. ### 800",
  "task": "calculation"
}
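The `_answer` field chains tool calls through placeholders: each `[CAL(...) -> %sN]` call stores its result in a slot that later calls can reference. The following is a minimal sketch of how such a chain could be resolved, assuming the bracket syntax seen in the example record. The helper names `cal` and `resolve_answer` are hypothetical and are not part of the Confucius codebase; the toy calculator only handles `+`, `-`, `*`, and `/`.

```python
import ast
import operator
import re

# Arithmetic operators accepted by the toy calculator (an assumption of this sketch).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def cal(expression: str) -> float:
    """Evaluate a +, -, *, / expression by walking its AST instead of calling eval()."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return walk(ast.parse(expression, mode="eval"))


# Matches calls such as "[CAL(2*%s1) -> %s2]".
CALL = re.compile(r"\[CAL\((?P<expr>[^\]]*?)\)\s*->\s*(?P<slot>%s\d+)\]")


def resolve_answer(answer: str) -> dict:
    """Run each CAL call in order, feeding earlier results into later calls."""
    slots = {}
    for match in CALL.finditer(answer):
        expr = match.group("expr")
        # Substitute longer slot names first so %s10 is not mistaken for %s1.
        for name in sorted(slots, key=len, reverse=True):
            expr = expr.replace(name, repr(slots[name]))
        slots[match.group("slot")] = cal(expr)
    return slots


# Shortened version of the "_answer" string from the example record.
answer = (
    "each part is therefore [CAL(2500/5) -> %s1]. "
    "Mike will get [CAL(2*%s1) -> %s2]. "
    "he will have [CAL(%s2-200) -> %s3] left. ### 800"
)
results = resolve_answer(answer)
print(results)
```

In this sketch the last slot, `%s3`, holds the value that the record reports after `###`.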

Dataset download

The dataset is shared on Google Drive, with training sets provided in three sizes (small, middle, large).

Dataset citation

@inproceedings{gao2023confucius,
  title={Confucius: Iterative tool learning from introspection feedback by easy-to-difficult curriculum},
  author={Gao, Shen and Shi, Zhengliang and Zhu, Minghang and Fang, Bowen and Xin, Xin and Ren, Pengjie and Chen, Zhumin and Ma, Jun},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024}
}

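Collected summary

Dataset introduction

Construction method

The research team proposes a tool-learning framework named Confucius, built in two stages. First, a multi-stage learning method teaches the LLM to use various tools following an easy-to-difficult curriculum. Second, an iterative self-instruct method builds the dataset dynamically from introspective feedback, improving the model's ability to use complex tools.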
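To illustrate what an easy-to-difficult ordering can look like on records of this shape, the sketch below groups examples into stages by their `number` field (API-call count). Treating call count as a difficulty proxy is an assumption made only for this illustration; it is not the authors' curriculum, and the function name `curriculum_stages` and the task names other than `calculation` are hypothetical.

```python
from itertools import groupby


def curriculum_stages(records):
    """Group records into stages ordered by ascending API-call count.

    Assumption: fewer API calls means an easier example. This is a stand-in
    difficulty measure for illustration only.
    """
    ordered = sorted(records, key=lambda r: r["number"])
    return [list(group) for _, group in groupby(ordered, key=lambda r: r["number"])]


# Hypothetical records carrying only the fields this sketch needs.
records = [
    {"task": "calculation", "number": 3},
    {"task": "weather", "number": 1},
    {"task": "calculation", "number": 1},
    {"task": "translation", "number": 2},
]

stages = curriculum_stages(records)
for index, stage in enumerate(stages, start=1):
    print(index, [r["task"] for r in stage])
```

A training loop could then iterate over `stages` in order, so that single-call examples are seen before multi-call ones.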

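Usage

To use the Confucius dataset, first download the dataset in the desired size from the link provided, then remove the test-set tools as the experiment requires in order to evaluate the model's generalization. For training, set up the environment and train the model with the pytorch-lightning and transformers libraries; the deepspeed library can be used to speed up training.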
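The tool-removal step can be sketched as a filter over the records. This sketch assumes that "removing test-set tools" means dropping training examples that call any tool reserved for evaluation, and that the first item of each `api` entry is the tool name, as in the example record. The function names and the `WEATHER` tool are hypothetical.

```python
def tools_used(record):
    """Return the set of tool names a record calls (first item of each api entry)."""
    return {entry[0] for entry in record["api"]}


def drop_held_out(train_records, held_out_tools):
    """Keep only the training records that call none of the held-out tools."""
    held_out = set(held_out_tools)
    return [r for r in train_records if not (tools_used(r) & held_out)]


# Hypothetical training records; only the "api" and "task" fields are filled in.
train_records = [
    {"task": "calculation", "api": [["CAL", "expression: 2500/5", "..."]]},
    {"task": "weather", "api": [["WEATHER", "city: Jinan", "..."]]},
    {
        "task": "mixed",
        "api": [["CAL", "expression: 1+2", "..."], ["WEATHER", "city: Jinan", "..."]],
    },
]

# Reserve the CAL tool for the unseen-tool evaluation.
seen_only = drop_held_out(train_records, held_out_tools=["CAL"])
print([r["task"] for r in seen_only])
```

A record that mixes a held-out tool with other tools is dropped entirely here; whether to keep such records is a design choice that depends on the experiment.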

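Background and challenges

Background overview

The Confucius dataset grew out of work on combining large language models (LLMs) with external tools to improve their task planning and tool invocation in practical tasks. It was proposed by Shen Gao and colleagues, and aims to train LLMs to use complex tools in real-world scenarios through "iterative tool learning from introspection feedback by easy-to-difficult curriculum". The paper was accepted by AAAI 2024 on December 20, 2023; the code was open-sourced on January 31, 2024, and the dataset was uploaded for download on March 1 of the same year.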

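Current challenges

The main challenges for this dataset are: 1) selecting the appropriate tool from a large tool set, which is essential for applying tool-learning models in the real world; and 2) construction challenges, such as implementing the multi-stage learning method and applying iterative introspective feedback to improve the model's use of complex tools. The dataset builders also need to address how well the model generalizes to settings with unseen tools.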

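Common scenarios

Classic use case

The Confucius dataset is intended to improve the task planning and tool invocation abilities of large language models (LLMs) in practical tasks. Its classic use case is training an LLM to use various tools through a curriculum that progresses from easy to increasingly complex, and then refining the model's use of complex tools through iterative self-instruction with introspective feedback.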

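Academic problems addressed

The dataset addresses a limitation of conventional tool-learning methods: the model learns only to execute human-provided tools in a controlled environment and cannot select the appropriate tool from a large tool set. By introducing a curriculum of increasing difficulty and an iterative self-instruct mechanism, Confucius improves the model's ability to handle complex tasks in real-world scenarios.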

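Practical applications

In practice, the Confucius dataset can help improve an agent's decision-making, so that when it handles complex tasks such as resource allocation or financial planning it calls the appropriate tools more effectively, improving the accuracy and efficiency of task completion.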

Recent research on the dataset