
nphearum/Codex-Reasoning-4000x-filtered

Source: Hugging Face. Updated 2026-04-22. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/nphearum/Codex-Reasoning-4000x-filtered
Description:
Codex-Reasoning is a curated, fully synthetic coding dataset for instruction tuning and for fine-tuning existing models to improve code generation and reasoning. It represents a large, comprehensively filtered corpus of coding data on the Hugging Face platform and emphasizes step-by-step reasoning for deeper model training. The dataset contains 4,000 highly curated coding examples spanning programming domains from basic syntax to advanced software engineering. Quality is enforced through multi-stage filtering and verification, including ranking-based filtering and expert selection. Each example includes a problem statement, a step-by-step solution, and final executable code. Intended uses include fine-tuning for code generation and reasoning, training instruction-following models with a coding and reasoning focus, benchmarking model performance on coding tasks and logical reasoning, research on AI-assisted programming and explainable AI, and educational applications that need step-by-step code explanations and reasoning.
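
The three-part record structure described above (problem statement, step-by-step solution, final code) maps naturally onto a prompt/completion pair for instruction tuning. The following is a minimal sketch only: the field names `problem`, `reasoning`, and `code`, the `CodingExample` class, and the prompt template are all hypothetical, since the listing does not give the dataset's actual column names or format.

```python
from dataclasses import dataclass


@dataclass
class CodingExample:
    # Hypothetical field names; the dataset's real column names may differ.
    problem: str    # problem statement
    reasoning: str  # step-by-step solution
    code: str       # final executable code


def to_instruction_pair(example: CodingExample) -> dict:
    """Turn one record into a prompt/completion pair for supervised fine-tuning."""
    # The template below is an illustrative choice, not a format defined by the dataset.
    prompt = f"### Problem\n{example.problem.strip()}\n\n### Response\n"
    completion = (
        f"Reasoning:\n{example.reasoning.strip()}\n\n"
        f"Final code:\n{example.code.strip()}\n"
    )
    return {"prompt": prompt, "completion": completion}


# A made-up record used only to exercise the function.
sample = CodingExample(
    problem="Return the sum of a list of integers.",
    reasoning="Accumulate a running total over the list; the built-in sum does this.",
    code="def total(xs):\n    return sum(xs)",
)
pair = to_instruction_pair(sample)
print(pair["prompt"] + pair["completion"])
```

Keeping the reasoning and the final code in the completion, rather than the code alone, is one way to preserve the step-by-step emphasis the description mentions; other splits are equally plausible.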
Provider:
nphearum