
DistilQwen_100k

Source: ModelScope community · Updated 2026-01-07 · Indexed 2025-03-01
Download link:
https://modelscope.cn/datasets/PAI/DistilQwen_100k
Description:
To support community developers in avoiding "catastrophic forgetting" when fine-tuning the DistilQwen2.5 model, we have open-sourced a portion of the dataset used for model training. The dataset is designed to provide a solid foundation for model fine-tuning, helping to enhance the model's adaptability to new tasks while maintaining its performance on previous ones. The released data covers various domains, including mathematics, coding, knowledge-based Q&A, instruction following, and creative generation, with a total volume of 10k samples. When fine-tuning the model with their own data, users can incorporate DistilQwen_100k to ensure strong performance on downstream tasks without compromising the model's general capabilities, thereby preserving its generalization ability.

## Reference

For more detailed information about the dataset construction process, we encourage you to refer to our paper:

- **DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models**
  Chengyu Wang, Junbing Yan, Yuanhao Yue, Jun Huang
  [arXiv:2504.15027](https://arxiv.org/abs/2504.15027)

You can cite the paper using the following citation format:

```bibtex
@misc{wang2025distilqwen25industrialpracticestraining,
      title={DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models},
      author={Chengyu Wang and Junbing Yan and Yuanhao Yue and Jun Huang},
      year={2025},
      eprint={2504.15027},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.15027}
}
```
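
The description above recommends mixing DistilQwen_100k into your own fine-tuning data, but it does not specify how. The following is a minimal sketch of one common approach (replaying a random subset of general-purpose samples alongside the task data), not the authors' documented procedure. The function name `mix_with_replay`, the `replay_ratio` parameter, and the `instruction`/`output` field names are all hypothetical and chosen only for illustration; check the actual schema of the dataset before adapting it.

```python
import random
from typing import Dict, List, Sequence

# Hypothetical record shape; the real dataset's field names may differ.
Sample = Dict[str, str]


def mix_with_replay(
    own_data: Sequence[Sample],
    general_data: Sequence[Sample],
    replay_ratio: float = 0.5,
    seed: int = 0,
) -> List[Sample]:
    """Combine task-specific data with a random subset of general data.

    replay_ratio is the number of general samples added per task sample
    (an assumed knob; the source does not prescribe a ratio).
    """
    if replay_ratio < 0:
        raise ValueError("replay_ratio must be non-negative")

    rng = random.Random(seed)
    # Never request more general samples than are available.
    n_replay = min(len(general_data), round(len(own_data) * replay_ratio))
    replay = rng.sample(list(general_data), n_replay)

    # Shuffle so task and general samples are interleaved during training.
    mixed = list(own_data) + replay
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    own = [{"instruction": f"task {i}", "output": "..."} for i in range(4)]
    general = [{"instruction": f"general {i}", "output": "..."} for i in range(10)]
    mixed = mix_with_replay(own, general, replay_ratio=0.5)
    print(len(mixed))  # 4 task samples + 2 replayed general samples = 6
```

In practice `general_data` would be the records loaded from DistilQwen_100k, and a suitable `replay_ratio` would have to be tuned empirically against both the downstream task and a general-capability benchmark.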

Provider:
maas
Created:
2025-02-21