xingyaoww/code-act

Name: xingyaoww/code-act
Creator: xingyaoww
Published: 2024-02-05 05:23:24
License: 暂无描述

Hugging Face2024-02-05 更新2024-03-04 收录

下载链接：

https://hf-mirror.com/datasets/xingyaoww/code-act

下载链接

链接失效反馈

官方服务：

资源简介：

--- configs: - config_name: default data_files: - split: codeact path: data/codeact-* - split: general path: data/general-* dataset_info: features: - name: id dtype: string - name: conversations list: - name: content dtype: string - name: role dtype: string splits: - name: codeact num_bytes: 34936511 num_examples: 7139 - name: general num_bytes: 250817144 num_examples: 71246 download_size: 123084833 dataset_size: 285753655 license: apache-2.0 task_categories: - text-generation language: - en tags: - llm-agent - llm - instruction-tuning size_categories: - 1K<n<10K --- <h1 align="center"> Executable Code Actions Elicit Better LLM Agents </h1> <p align="center"> <a href="https://github.com/xingyaoww/code-act">💻 Code</a> • <a href="https://arxiv.org/abs/2402.01030">📃 Paper</a> • <a href="https://huggingface.co/datasets/xingyaoww/code-act" >🤗 Data (CodeActInstruct)</a> • <a href="https://huggingface.co/xingyaoww/CodeActAgent-Mistral-7b-v0.1" >🤗 Model (CodeActAgent-Mistral-7b-v0.1)</a> • <a href="https://chat.xwang.dev/">🤖 Chat with CodeActAgent!</a> </p> We propose to use executable Python **code** to consolidate LLM agents’ **act**ions into a unified action space (**CodeAct**). Integrated with a Python interpreter, CodeAct can execute code actions and dynamically revise prior actions or emit new actions upon new observations (e.g., code execution results) through multi-turn interactions. ![Overview](https://github.com/xingyaoww/code-act/blob/main/figures/overview.png?raw=true) ## Why CodeAct? Our extensive analysis of 17 LLMs on API-Bank and a newly curated benchmark [M<sup>3</sup>ToolEval](docs/EVALUATION.md) shows that CodeAct outperforms widely used alternatives like Text and JSON (up to 20% higher success rate). Please check our paper for more detailed analysis! ![Comparison between CodeAct and Text/JSON](https://github.com/xingyaoww/code-act/blob/main/figures/codeact-comparison-table.png?raw=true) *Comparison between CodeAct and Text / JSON as action.* ![Comparison between CodeAct and Text/JSON](https://github.com/xingyaoww/code-act/blob/main/figures/codeact-comparison-perf.png?raw=true) *Quantitative results comparing CodeAct and {Text, JSON} on M<sup>3</sup>ToolEval.* ## 📁 CodeActInstruct We collect an instruction-tuning dataset CodeActInstruct that consists of 7k multi-turn interactions using CodeAct. Dataset is release at [huggingface dataset 🤗](https://huggingface.co/datasets/xingyaoww/code-act). Please refer to the paper and [this section](#-data-generation-optional) for details of data collection. ![Data Statistics](https://github.com/xingyaoww/code-act/blob/main/figures/data-stats.png?raw=true) *Dataset Statistics. Token statistics are computed using Llama-2 tokenizer.* ## 🪄 CodeActAgent Trained on **CodeActInstruct** and general conversaions, **CodeActAgent** excels at out-of-domain agent tasks compared to open-source models of the same size, while not sacrificing generic performance (e.g., knowledge, dialog). We release two variants of CodeActAgent: - **CodeActAgent-Mistral-7b-v0.1** (recommended, [model link](https://huggingface.co/xingyaoww/CodeActAgent-Mistral-7b-v0.1)): using Mistral-7b-v0.1 as the base model with 32k context window. - **CodeActAgent-Llama-7b** ([model link](https://huggingface.co/xingyaoww/CodeActAgent-Llama-2-7b)): using Llama-2-7b as the base model with 4k context window. ![Model Performance](https://github.com/xingyaoww/code-act/blob/main/figures/model-performance.png?raw=true) *Evaluation results for CodeActAgent. ID and OD stand for in-domain and out-of-domain evaluation correspondingly. Overall averaged performance normalizes the MT-Bench score to be consistent with other tasks and excludes in-domain tasks for fair comparison.* Please check out [our paper](TODO) and [code](https://github.com/xingyaoww/code-act) for more details about data collection, model training, and evaluation. ## 📚 Citation ```bibtex @misc{wang2024executable, title={Executable Code Actions Elicit Better LLM Agents}, author={Xingyao Wang and Yangyi Chen and Lifan Yuan and Yizhe Zhang and Yunzhu Li and Hao Peng and Heng Ji}, year={2024}, eprint={2402.01030}, archivePrefix={arXiv}, primaryClass={cs.CL} } ```

提供机构：

xingyaoww

原始信息汇总

数据集概述

数据集配置

默认配置：
- 数据文件：
  - codeact：路径为 data/codeact-*
  - general：路径为 data/general-*

数据集信息

特征：
- id：数据类型为字符串
- conversations：列表类型，包含以下子特征：
  - content：数据类型为字符串
  - role：数据类型为字符串
分割：
- codeact：
  - 字节数：34936511
  - 样本数：7139
- general：
  - 字节数：250817144
  - 样本数：71246
下载大小：123084833
数据集大小：285753655

许可

许可证：Apache 2.0

任务类别

文本生成

语言

英语

大小类别

1K<n<10K

5,000+

优质数据集

54 个

任务类型

进入经典数据集