AceCode-87K

Name: AceCode-87K
Creator: maas
Published: 2025-12-26 16:22:44
License: No description available

ModelScope community. Updated 2025-12-26; indexed 2025-02-15.

Download link:

https://modelscope.cn/datasets/TIGER-Lab/AceCode-87K


Resource description:

# 🂡 AceCode-87K

[Paper](https://arxiv.org/abs/2502.01718) | [GitHub](https://github.com/TIGER-AI-Lab/AceCoder) | [AceCode-87K](https://huggingface.co/datasets/TIGER-Lab/AceCode-87K) | [AceCodePair-300K](https://huggingface.co/datasets/TIGER-Lab/AceCodePair-300K) | [RM/RL Models](https://huggingface.co/collections/TIGER-Lab/acecoder-67a16011a6c7d65cad529eba)

We introduce AceCoder, the first work to propose a fully automated pipeline for synthesizing large-scale, reliable tests for reward model training and reinforcement learning in the coding scenario. To do this, we curated the dataset AceCode-87K: we start from a seed code dataset, prompt powerful LLMs to "imagine" proper test cases for each coding question, and filter out the noisy ones. We then sample inferences from existing coder models and compute their pass rates, which serve as reliable and verifiable rewards both for training the reward model and for reinforcement learning of the coder LLM.

- **This dataset is the official AceCodeRM-87K after the test case filtering**.
- Each question in the dataset is rewritten by GPT-4o-mini and comes with an average of **16** cleaned test cases.
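The pass-rate reward described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the helper name `pass_rate` is hypothetical, and it assumes each test result is encoded as `1` for a passing test and `0` for a failing one.

```python
from typing import Sequence


def pass_rate(test_results: Sequence[int]) -> float:
    """Fraction of test cases that passed.

    Assumption: 1 marks a passing test and 0 a failing one.
    An empty result list is treated as a pass rate of 0.0.
    """
    if not test_results:
        return 0.0
    passed = sum(1 for result in test_results if result == 1)
    return passed / len(test_results)


# Hypothetical results for one sampled completion on four test cases.
example_results = [1, 1, 0, 1]
print(pass_rate(example_results))  # 0.75
```

A completion that passes more of the filtered test cases receives a higher scalar reward, which is what makes the signal usable for both reward model training and RL.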
![AceCoder pipeline overview](https://tiger-ai-lab.github.io/AceCoder/static/images/ac_overview.png)

## Data Formats

- `Id` (str): Unique identifier for each question
- `source` (str): The dataset the question comes from
- `question` (str): The question text
- `test_cases` (List[str]): Test cases for the question
- `inferences` (List[dict]): Sampled model completions for the question
  - `model_name` (str): The name of the model that produced the completion
  - `completion_id` (int): The completion id
  - `completion` (str): The completion text
  - `pass_rate` (float): The pass rate of the completion on the test cases
  - `test_results` (List[int]): The per-test-case results
- `context_messages` (List[List[dict]]): Context messages for the question
  - `content` (str): The content of the message
  - `role` (str): The role of the message

## Usage

- **Direct use**

```python
import datasets
dataset = datasets.load_dataset("TIGER-Lab/AceCode-87K", split='train')
```

- **Use for RL tuning**: This dataset can be used directly for RL tuning with the OpenRLHF code, where you should set `context_messages` as the key.

## Citation

```bibtex
@article{AceCoder,
    title={AceCoder: Acing Coder RL via Automated Test-Case Synthesis},
    author={Zeng, Huaye and Jiang, Dongfu and Wang, Haozhe and Nie, Ping and Chen, Xiaotong and Chen, Wenhu},
    journal={ArXiv},
    year={2025},
    volume={abs/2502.01718}
}
```
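As an illustration of how the `inferences` field from Data Formats might be consumed, the sketch below picks the highest and lowest scoring completions of one record by `pass_rate`. It is a hypothetical example and not the procedure used to build AceCodePair-300K: the helper `best_and_worst`, the `min_gap` parameter, and the hand-written record are all assumptions, and the record only mimics the documented field layout so the code runs without downloading the dataset.

```python
from typing import Any, Dict, Optional, Tuple

Inference = Dict[str, Any]


def best_and_worst(
    record: Dict[str, Any], min_gap: float = 0.0
) -> Optional[Tuple[Inference, Inference]]:
    """Return (best, worst) completions of a record ranked by pass_rate.

    Returns None when fewer than two completions exist or when the
    pass-rate gap between best and worst does not exceed `min_gap`.
    """
    inferences = record["inferences"]
    if len(inferences) < 2:
        return None
    ranked = sorted(inferences, key=lambda inf: inf["pass_rate"])
    worst, best = ranked[0], ranked[-1]
    if best["pass_rate"] - worst["pass_rate"] <= min_gap:
        return None
    return best, worst


# Hand-written record that follows the documented field layout (made-up values).
record = {
    "question": "Write a function add(a, b) that returns a + b.",
    "test_cases": ["assert add(1, 2) == 3", "assert add(0, 0) == 0"],
    "inferences": [
        {"model_name": "model-a", "completion_id": 0,
         "completion": "def add(a, b):\n    return a + b",
         "pass_rate": 1.0, "test_results": [1, 1]},
        {"model_name": "model-b", "completion_id": 1,
         "completion": "def add(a, b):\n    return a - b",
         "pass_rate": 0.5, "test_results": [0, 1]},
    ],
}

pair = best_and_worst(record, min_gap=0.25)
if pair is not None:
    best, worst = pair
    print(best["model_name"], worst["model_name"])
```

Records loaded with `datasets.load_dataset` could be passed to the same helper, provided their `inferences` entries carry the `pass_rate` field as documented.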


Provider:

maas

Created:

2025-02-09
