five

xingyaoww/code-act

收藏
Hugging Face2024-02-05 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/xingyaoww/code-act
下载链接
链接失效反馈
官方服务:
资源简介:
--- configs: - config_name: default data_files: - split: codeact path: data/codeact-* - split: general path: data/general-* dataset_info: features: - name: id dtype: string - name: conversations list: - name: content dtype: string - name: role dtype: string splits: - name: codeact num_bytes: 34936511 num_examples: 7139 - name: general num_bytes: 250817144 num_examples: 71246 download_size: 123084833 dataset_size: 285753655 license: apache-2.0 task_categories: - text-generation language: - en tags: - llm-agent - llm - instruction-tuning size_categories: - 1K<n<10K --- <h1 align="center"> Executable Code Actions Elicit Better LLM Agents </h1> <p align="center"> <a href="https://github.com/xingyaoww/code-act">💻 Code</a> • <a href="https://arxiv.org/abs/2402.01030">📃 Paper</a> • <a href="https://huggingface.co/datasets/xingyaoww/code-act" >🤗 Data (CodeActInstruct)</a> • <a href="https://huggingface.co/xingyaoww/CodeActAgent-Mistral-7b-v0.1" >🤗 Model (CodeActAgent-Mistral-7b-v0.1)</a> • <a href="https://chat.xwang.dev/">🤖 Chat with CodeActAgent!</a> </p> We propose to use executable Python **code** to consolidate LLM agents’ **act**ions into a unified action space (**CodeAct**). Integrated with a Python interpreter, CodeAct can execute code actions and dynamically revise prior actions or emit new actions upon new observations (e.g., code execution results) through multi-turn interactions. ![Overview](https://github.com/xingyaoww/code-act/blob/main/figures/overview.png?raw=true) ## Why CodeAct? Our extensive analysis of 17 LLMs on API-Bank and a newly curated benchmark [M<sup>3</sup>ToolEval](docs/EVALUATION.md) shows that CodeAct outperforms widely used alternatives like Text and JSON (up to 20% higher success rate). Please check our paper for more detailed analysis! ![Comparison between CodeAct and Text/JSON](https://github.com/xingyaoww/code-act/blob/main/figures/codeact-comparison-table.png?raw=true) *Comparison between CodeAct and Text / JSON as action.* ![Comparison between CodeAct and Text/JSON](https://github.com/xingyaoww/code-act/blob/main/figures/codeact-comparison-perf.png?raw=true) *Quantitative results comparing CodeAct and {Text, JSON} on M<sup>3</sup>ToolEval.* ## 📁 CodeActInstruct We collect an instruction-tuning dataset CodeActInstruct that consists of 7k multi-turn interactions using CodeAct. Dataset is release at [huggingface dataset 🤗](https://huggingface.co/datasets/xingyaoww/code-act). Please refer to the paper and [this section](#-data-generation-optional) for details of data collection. ![Data Statistics](https://github.com/xingyaoww/code-act/blob/main/figures/data-stats.png?raw=true) *Dataset Statistics. Token statistics are computed using Llama-2 tokenizer.* ## 🪄 CodeActAgent Trained on **CodeActInstruct** and general conversaions, **CodeActAgent** excels at out-of-domain agent tasks compared to open-source models of the same size, while not sacrificing generic performance (e.g., knowledge, dialog). We release two variants of CodeActAgent: - **CodeActAgent-Mistral-7b-v0.1** (recommended, [model link](https://huggingface.co/xingyaoww/CodeActAgent-Mistral-7b-v0.1)): using Mistral-7b-v0.1 as the base model with 32k context window. - **CodeActAgent-Llama-7b** ([model link](https://huggingface.co/xingyaoww/CodeActAgent-Llama-2-7b)): using Llama-2-7b as the base model with 4k context window. ![Model Performance](https://github.com/xingyaoww/code-act/blob/main/figures/model-performance.png?raw=true) *Evaluation results for CodeActAgent. ID and OD stand for in-domain and out-of-domain evaluation correspondingly. Overall averaged performance normalizes the MT-Bench score to be consistent with other tasks and excludes in-domain tasks for fair comparison.* Please check out [our paper](TODO) and [code](https://github.com/xingyaoww/code-act) for more details about data collection, model training, and evaluation. ## 📚 Citation ```bibtex @misc{wang2024executable, title={Executable Code Actions Elicit Better LLM Agents}, author={Xingyao Wang and Yangyi Chen and Lifan Yuan and Yizhe Zhang and Yunzhu Li and Hao Peng and Heng Ji}, year={2024}, eprint={2402.01030}, archivePrefix={arXiv}, primaryClass={cs.CL} } ```
提供机构:
xingyaoww
原始信息汇总

数据集概述

数据集配置

  • 默认配置
    • 数据文件
      • codeact:路径为 data/codeact-*
      • general:路径为 data/general-*

数据集信息

  • 特征

    • id:数据类型为字符串
    • conversations:列表类型,包含以下子特征:
      • content:数据类型为字符串
      • role:数据类型为字符串
  • 分割

    • codeact
      • 字节数:34936511
      • 样本数:7139
    • general
      • 字节数:250817144
      • 样本数:71246
  • 下载大小:123084833

  • 数据集大小:285753655

许可

  • 许可证:Apache 2.0

任务类别

  • 文本生成

语言

  • 英语

标签

  • llm-agent
  • llm
  • instruction-tuning

大小类别

  • 1K<n<10K
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作