rombodawg/code_bagel_hermes-2.5
Source: Hugging Face. Updated: 2024-10-08. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/rombodawg/code_bagel_hermes-2.5
Description:
This dataset combines the code_bagel and Open-Hermes-2.5 datasets. It contains 900k lines of high-quality non-code instruction data and 3 million lines of high-quality code instruction data. Each line holds at most 10,000 tokens, and the data covers more than 100 programming languages. The merged data was deduplicated and uncensored, and the dataset aims to be the ultimate code fine-tuning dataset, able to handle almost any task.
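The merge, deduplication, and per-line token cap described above can be illustrated with a minimal sketch. This is not the author's actual pipeline. The field names `instruction` and `output`, the helper names, and the whitespace word count used in place of a real tokenizer are all assumptions made for illustration.

```python
import hashlib
from typing import Iterable

MAX_TOKENS = 10_000  # per-line cap stated in the description


def row_key(row: dict) -> str:
    """Hash the stripped instruction/output pair to detect exact duplicates."""
    # Field names are hypothetical; the real dataset schema may differ.
    text = row["instruction"].strip() + "\x00" + row["output"].strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def approx_tokens(row: dict) -> int:
    """Whitespace word count, a crude stand-in for a real tokenizer (assumption)."""
    return len((row["instruction"] + " " + row["output"]).split())


def merge_dedupe(*sources: Iterable[dict], max_tokens: int = MAX_TOKENS) -> list[dict]:
    """Concatenate sources, drop over-long rows, and keep the first copy of each duplicate."""
    seen: set[str] = set()
    merged: list[dict] = []
    for source in sources:
        for row in source:
            if approx_tokens(row) > max_tokens:
                continue  # row exceeds the length cap
            key = row_key(row)
            if key in seen:
                continue  # exact duplicate of an earlier row
            seen.add(key)
            merged.append(row)
    return merged


# Toy stand-ins for the two source datasets.
code_rows = [
    {"instruction": "Reverse a string in Python.", "output": "s[::-1]"},
    {"instruction": "Add two numbers in C.", "output": "int add(int a, int b) { return a + b; }"},
]
general_rows = [
    {"instruction": "Reverse a string in Python. ", "output": "s[::-1]"},  # duplicate after stripping
    {"instruction": "Name the capital of France.", "output": "Paris"},
]

combined = merge_dedupe(code_rows, general_rows)
print(len(combined))
```

Exact-match hashing only catches identical rows after whitespace stripping. Near-duplicate detection would need a different technique, and the source does not say which method was used.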
Provided by:
rombodawg



