
OLMo-Coding/starcoder-python-instruct

Source: Hugging Face. Updated: 2025-09-17. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/OLMo-Coding/starcoder-python-instruct
Description:
The StarCoder-Python-Qwen-Instruct dataset pairs Python code samples with synthetically generated natural-language instructions and is intended for supervised fine-tuning of language models on code generation tasks. It is built from the Python subset of the bigcode/starcoderdata corpus, with instructions generated by a Qwen model. The data has been filtered and preprocessed to suit models with a limited context window, such as OLMo-2.
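
The description does not say how the context-window filtering was done. As a rough illustration only, the sketch below keeps instruction/code pairs whose combined length fits a token budget. The field names (`instruction`, `code`), the 4096-token default, and the whitespace-based token count are all assumptions made for this example, not details of the actual dataset pipeline.

```python
from typing import Callable, Dict, Iterable, List

Example = Dict[str, str]


def whitespace_token_count(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated pieces."""
    return len(text.split())


def filter_by_context_budget(
    examples: Iterable[Example],
    max_tokens: int = 4096,  # assumed budget, not taken from the dataset card
    count_tokens: Callable[[str], int] = whitespace_token_count,
    instruction_key: str = "instruction",  # hypothetical field name
    code_key: str = "code",  # hypothetical field name
) -> List[Example]:
    """Keep only examples whose instruction plus code fit within max_tokens."""
    kept = []
    for example in examples:
        total = count_tokens(example[instruction_key]) + count_tokens(example[code_key])
        if total <= max_tokens:
            kept.append(example)
    return kept


# Two toy records: a short one that fits and a long one that does not.
samples = [
    {
        "instruction": "Write a function that adds two numbers.",
        "code": "def add(a, b):\n    return a + b",
    },
    {
        "instruction": "Print a long list.",
        "code": "x = 1 " * 50,
    },
]
kept = filter_by_context_budget(samples, max_tokens=20)
```

In practice the `count_tokens` argument would be replaced by the target model's own tokenizer, since whitespace counts can differ substantially from real token counts for source code.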
Provider:
OLMo-Coding