AceCode-89K

Name: AceCode-89K
Creator: maas
Published: 2025-12-04 16:22:18
License: No description available

ModelScope community · Updated 2025-12-04 · Indexed 2025-02-08

Download link:

https://modelscope.cn/datasets/TIGER-Lab/AceCode-89K

Resource description:

# 🂡 AceCode-87K

[Paper](https://arxiv.org/abs/2502.01718) | [GitHub](https://github.com/TIGER-AI-Lab/AceCoder) | [AceCode-87K](https://huggingface.co/datasets/TIGER-Lab/AceCode-87K) | [AceCodePair-300K](https://huggingface.co/datasets/TIGER-Lab/AceCodePair-300K) | [RM/RL Models](https://huggingface.co/collections/TIGER-Lab/acecoder-67a16011a6c7d65cad529eba)

We introduce AceCoder, the first work to propose a fully automated pipeline for synthesizing large-scale, reliable tests for reward model training and reinforcement learning in the coding scenario. To do this, we curated the dataset AceCode-87K: starting from a seed code dataset, we prompt powerful LLMs to "imagine" proper test cases for each coding question, then filter out the noisy ones. We sample inferences from existing coder models and compute their pass rates as reliable, verifiable rewards, both for training the reward model and for running reinforcement learning on coder LLMs.

- **This dataset is the official AceCodeRM-87K after the test case filtering**.
- Each question in the dataset is rewritten by GPT-4o-mini and comes with an average of **16** cleaned test cases.
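The pass-rate reward described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each test case is a Python `assert` statement stored as a string, and the helper names `run_tests` and `pass_rate` are hypothetical.

```python
from typing import List


def run_tests(completion: str, test_cases: List[str]) -> List[int]:
    """Return 1 for each test case that passes and 0 for each that fails.

    Assumption: every test case is a single `assert ...` statement.
    Warning: exec() runs arbitrary code; real pipelines should sandbox it.
    """
    results = []
    for test in test_cases:
        namespace: dict = {}  # fresh namespace so tests cannot affect each other
        try:
            exec(completion, namespace)  # define the candidate solution
            exec(test, namespace)        # run one assert against it
            results.append(1)
        except Exception:
            # Covers failed asserts as well as syntax or runtime errors.
            results.append(0)
    return results


def pass_rate(results: List[int]) -> float:
    """Fraction of passed tests; 0.0 when there are no tests."""
    return sum(results) / len(results) if results else 0.0


# Toy example with invented data: the third test is deliberately wrong.
completion = "def add(a, b):\n    return a + b\n"
tests = [
    "assert add(1, 2) == 3",
    "assert add(-1, 1) == 0",
    "assert add(2, 2) == 5",
]
results = run_tests(completion, tests)
print(results, pass_rate(results))
```

Each completion gets a score between 0 and 1, which is the kind of verifiable signal the description refers to when it mentions pass rates as rewards.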
![https://tiger-ai-lab.github.io/AceCoder/static/images/ac_overview.png](https://tiger-ai-lab.github.io/AceCoder/static/images/ac_overview.png)

## Data Formats

- `Id` (str): Unique identifier for each question
- `source` (str): The dataset the question comes from
- `question` (str): The question text
- `test_cases` (List[str]): Test cases for the question
- `inferences` (List[dict]): Sampled model completions, each with:
  - `model_name` (str): The name of the model that produced the completion
  - `completion_id` (int): The completion id
  - `completion` (str): The completion text
  - `pass_rate` (float): The pass rate of the completion
  - `test_results` (List[int]): The per-test results
- `context_messages` (List[List[dict]]): Context messages for the question, each with:
  - `content` (str): The content of the message
  - `role` (str): The role of the message

## Usage

- **Direct use**

```python
import datasets

dataset = datasets.load_dataset("TIGER-Lab/AceCode-87K", split='train')
```

- **Use for RL tuning**: This dataset can be used directly for RL tuning with the OpenRLHF code; set `context_messages` as the key.

## Citation

```bibtex
@article{AceCoder,
  title={AceCoder: Acing Coder RL via Automated Test-Case Synthesis},
  author={Zeng, Huaye and Jiang, Dongfu and Wang, Haozhe and Nie, Ping and Chen, Xiaotong and Chen, Wenhu},
  journal={ArXiv},
  year={2025},
  volume={abs/2502.01718}
}
```
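As an illustration of the record layout listed under Data Formats, the sketch below ranks the sampled completions of a single record by `pass_rate`. The record is invented for the example and only mirrors the documented field names; the helpers `best_inference` and `filter_by_pass_rate` are hypothetical, and treating `test_results` as 0/1 flags is an assumption.

```python
from typing import Any, Dict, List

# Hypothetical record that follows the documented schema; all values are invented.
record: Dict[str, Any] = {
    "Id": "example_0",
    "source": "example_source",
    "question": "Write a function add(a, b) that returns the sum of a and b.",
    "test_cases": ["assert add(1, 2) == 3", "assert add(0, 0) == 0"],
    "inferences": [
        {
            "model_name": "model_a",
            "completion_id": 0,
            "completion": "def add(a, b):\n    return a - b\n",
            "pass_rate": 0.5,
            "test_results": [0, 1],  # assumed: 1 = passed, 0 = failed
        },
        {
            "model_name": "model_a",
            "completion_id": 1,
            "completion": "def add(a, b):\n    return a + b\n",
            "pass_rate": 1.0,
            "test_results": [1, 1],
        },
    ],
}


def best_inference(rec: Dict[str, Any]) -> Dict[str, Any]:
    """Return the completion with the highest pass rate."""
    return max(rec["inferences"], key=lambda inf: inf["pass_rate"])


def filter_by_pass_rate(rec: Dict[str, Any], threshold: float) -> List[Dict[str, Any]]:
    """Keep only completions whose pass rate is at least `threshold`."""
    return [inf for inf in rec["inferences"] if inf["pass_rate"] >= threshold]


best = best_inference(record)
print(best["completion_id"], best["pass_rate"])
print(len(filter_by_pass_rate(record, 0.8)))
```

The same two helpers could be applied to rows returned by `datasets.load_dataset`, provided the loaded rows use the field names documented above.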

Provider: maas
Created: 2025-02-04
