FelixFester/CodeFeedback-Filtered-Instruction

Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/FelixFester/CodeFeedback-Filtered-Instruction
Description:
CodeFeedback-Filtered-Instruction is a curated collection of code instruction queries extracted from four open-source code instruction tuning datasets: Magicoder-OSS-Instruct, a Python code subset of ShareGPT, Magicoder-Evol-Instruct, and Evol-Instruct-Code. Initially, 287k queries were aggregated from these datasets. To isolate the most complex and informative instructions, the queries were filtered using the open-source chat model Qwen-72B-Chat, and only those rated 4 or 5 in complexity were retained, yielding a final collection of 156k high-quality single-turn code instructions. In subsequent processing steps, apart from Single-turn Packing, only the queries were used and the responses were not considered; this dataset nonetheless retains all responses for users' convenience.
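
The filtering step described above can be illustrated with a minimal sketch. It assumes each record already carries an integer complexity score; in the original pipeline that score came from Qwen-72B-Chat, which is not called here. The field names `query`, `answer`, and `complexity`, the helper `filter_by_complexity`, and the sample records are all hypothetical and are not taken from the dataset's actual schema.

```python
from typing import Dict, List

# Hypothetical record layout: "query", "answer", and "complexity" are
# illustrative field names, not the dataset's real schema.
Record = Dict[str, object]


def filter_by_complexity(records: List[Record], min_score: int = 4) -> List[Record]:
    """Keep only the records whose complexity score is at least min_score."""
    return [r for r in records if int(r["complexity"]) >= min_score]


# Made-up sample records; the scores are assumed to be precomputed.
sample: List[Record] = [
    {"query": "Print hello world.", "answer": "print('hello world')", "complexity": 1},
    {"query": "Implement an LRU cache with O(1) get and put.", "answer": "class LRUCache: pass", "complexity": 4},
    {"query": "Write a thread-safe token-bucket rate limiter.", "answer": "class RateLimiter: pass", "complexity": 5},
]

kept = filter_by_complexity(sample)
print(len(kept), "of", len(sample), "records kept")
```

The responses stay attached to the retained records, mirroring the description's note that all responses are kept even though only the queries were used downstream.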
Provider:
FelixFester