
OpenCoder-LLM/opc-annealing-corpus

Hugging Face | Updated: 2025-05-29 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/OpenCoder-LLM/opc-annealing-corpus
Description:
opc-annealing-corpus is a supplementary component of the OpenCoder dataset, used in the annealing phase. It consists of three parts: algorithmic_corpus, synthetic_code_snippet, and synthetic_qa. algorithmic_corpus is algorithm-related code sampled from The Stack v2; synthetic_code_snippet is a set of high-quality code snippets generated by rewriting algorithmic_corpus; and synthetic_qa is a set of high-quality question-answer pairs generated by adapting algorithmic_corpus. The data was used in OpenCoder's annealing phase, and its effectiveness was validated through ablation experiments.
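
As a rough illustration of how one of the three parts might be loaded, here is a minimal sketch using the Hugging Face `datasets` library. It assumes that the part names in the description are also the dataset's configuration names and that each one exposes a `train` split; neither is stated on this page. The helper `load_subset` is a hypothetical name introduced only for this example.

```python
# Part names as listed in the description above. Treating them as the
# Hugging Face configuration names is an assumption of this sketch.
SUBSETS = ("algorithmic_corpus", "synthetic_code_snippet", "synthetic_qa")
REPO_ID = "OpenCoder-LLM/opc-annealing-corpus"


def load_subset(name, streaming=True):
    """Hypothetical helper: load one part of the corpus.

    Requires the `datasets` package (pip install datasets).
    """
    if name not in SUBSETS:
        raise ValueError(f"unknown subset {name!r}; expected one of {SUBSETS}")
    # Imported lazily so the name check above works without the library.
    from datasets import load_dataset

    # split="train" is assumed, not confirmed by the listing.
    # streaming=True iterates over records without downloading everything first.
    return load_dataset(REPO_ID, name, split="train", streaming=streaming)
```

With network access, `next(iter(load_subset("synthetic_qa")))` would return the first record of that part as a dictionary; the field names are not described on this page, so inspect the record before relying on any of them.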
Provided by:
OpenCoder-LLM