DistilQwen_100k

Name: DistilQwen_100k
Creator: maas
Published: 2026-01-07 13:26:19
License: no description available

ModelScope community. Updated 2026-01-07; indexed 2025-03-01.

Download link:

https://modelscope.cn/datasets/PAI/DistilQwen_100k

Description:

To support community developers in avoiding the phenomenon of "catastrophic forgetting" when fine-tuning the DistilQwen2.5 model, we have open-sourced a portion of the dataset used for model training. These datasets are designed to provide a solid foundation for model fine-tuning, helping to enhance the model's adaptability to new tasks while maintaining its performance on previous ones.

The released data covers various domains, including mathematics, coding, knowledge-based Q&A, instruction following, and creative generation, with a total volume of 10k samples. When fine-tuning the model with their own data, users can incorporate DistilQwen_100k to ensure strong performance on downstream tasks without compromising the model's general capabilities, thereby preserving its generalization ability.

## Reference

For more detailed information about the dataset construction process, we encourage you to refer to our paper:

- **DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models**
  Chengyu Wang, Junbing Yan, Yuanhao Yue, Jun Huang
  [arXiv:2504.15027](https://arxiv.org/abs/2504.15027)

You can cite the paper using the following citation format:

```bibtex
@misc{wang2025distilqwen25industrialpracticestraining,
      title={DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models},
      author={Chengyu Wang and Junbing Yan and Yuanhao Yue and Jun Huang},
      year={2025},
      eprint={2504.15027},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.15027}
}
```
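The description recommends incorporating DistilQwen_100k alongside your own data during fine-tuning. The following is a minimal sketch of one way to do that mixing, not the authors' procedure. The function name `mix_with_replay`, the `replay_ratio` parameter, and the `instruction`/`output` field names are all hypothetical, and the placeholder lists stand in for records you would load from the dataset and from your own corpus.

```python
import random


def mix_with_replay(own_samples, general_samples, replay_ratio=0.5, seed=0):
    """Combine task-specific samples with a random subset of general samples.

    replay_ratio is the number of general samples drawn per task sample.
    It is an illustrative knob, not a value taken from the paper.
    """
    if replay_ratio < 0:
        raise ValueError("replay_ratio must be non-negative")
    rng = random.Random(seed)
    # Never request more general samples than are available.
    n_replay = min(len(general_samples), round(len(own_samples) * replay_ratio))
    replay = rng.sample(general_samples, n_replay)
    mixed = list(own_samples) + replay
    rng.shuffle(mixed)  # interleave so each batch sees both sources
    return mixed


# Placeholder records; the field names are assumptions, not the dataset schema.
own = [{"instruction": f"task {i}", "output": "..."} for i in range(8)]
general = [{"instruction": f"general {i}", "output": "..."} for i in range(100)]

mixed = mix_with_replay(own, general, replay_ratio=0.5)
print(len(mixed))
```

A suitable ratio depends on the size of your own data and how much general capability you need to retain, so treat the default here as a starting point to tune rather than a recommendation.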


Provider: maas

Created: 2025-02-21
