OLMo-Coding/starcoder-python-instruct
Hugging Face. Updated 2025-09-17. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/OLMo-Coding/starcoder-python-instruct
Description:
The StarCoder-Python-Qwen-Instruct dataset pairs Python code samples with synthetically generated natural-language instructions and is intended for supervised fine-tuning of language models on code generation tasks. It is built from the Python subset of the bigcode/starcoderdata corpus, with instructions generated by a Qwen model. The data has been filtered and preprocessed to suit models with a limited context window, such as OLMo-2.
Provided by:
OLMo-Coding
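
The description mentions that the data was filtered to fit a limited context window but does not say how. The following is only a minimal sketch of what such a length filter could look like, not the dataset authors' actual pipeline. The field names `instruction` and `code`, the 4096-token budget, and the whitespace-based token count are all assumptions made for illustration; a real pipeline would count tokens with the target model's tokenizer.

```python
from typing import Dict, List

# Assumed token budget for illustration; the source does not state the real limit.
MAX_TOKENS = 4096


def approx_token_count(text: str) -> int:
    """Crude proxy for token count: number of whitespace-separated pieces."""
    return len(text.split())


def fits_context(sample: Dict[str, str], max_tokens: int = MAX_TOKENS) -> bool:
    """Return True if instruction plus code stay within the token budget.

    The keys "instruction" and "code" are hypothetical field names.
    """
    total = approx_token_count(sample["instruction"]) + approx_token_count(sample["code"])
    return total <= max_tokens


def filter_samples(samples: List[Dict[str, str]], max_tokens: int = MAX_TOKENS) -> List[Dict[str, str]]:
    """Keep only the samples that fit within the budget."""
    return [s for s in samples if fits_context(s, max_tokens)]


# Toy data: one short pair and one pair whose code is far too long.
samples = [
    {
        "instruction": "Write a function that adds two numbers.",
        "code": "def add(a, b):\n    return a + b\n",
    },
    {
        "instruction": "Generate a very long module.",
        "code": "x = 1\n" * 5000,
    },
]

kept = filter_samples(samples)
print(len(kept))
```

With the toy data above, only the short pair survives the filter. Swapping `approx_token_count` for a real tokenizer would change the counts but not the structure of the filter.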



