Magpie-100K-Generator-Zoo
Source: ModelScope community. Updated 2025-12-05; indexed 2025-01-18.
Download link:
https://modelscope.cn/datasets/Magpie-Align/Magpie-100K-Generator-Zoo
Resource description:
### About This Dataset
This dataset is used by the paper ["Stronger Models are NOT Stronger Teachers for Instruction Tuning"](https://huggingface.co/papers/2411.07133).
To create this dataset, instructions are sampled from [Magpie-Air](https://huggingface.co/datasets/Magpie-Align/Llama-3-Magpie-Air-3M-v0.1). Responses are generated using 19 different response generators.
You can build a Direct Preference Optimization (DPO) dataset from the reward values we provide.
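As a rough illustration of how reward values can be turned into preference pairs, the sketch below picks the highest-reward response as `chosen` and the lowest-reward response as `rejected` for each instruction. The record layout (`instruction`, `responses`, `response`, `reward`) and the `min_margin` parameter are assumptions made for this example; they are not the dataset's documented schema, so check the actual column names before adapting it.

```python
from typing import Dict, List, Optional


def build_dpo_pair(record: Dict, min_margin: float = 0.0) -> Optional[Dict[str, str]]:
    """Turn one instruction and its scored responses into a preference pair.

    Assumed (hypothetical) record layout:
        {"instruction": str,
         "responses": [{"response": str, "reward": float}, ...]}
    """
    responses = record["responses"]
    if len(responses) < 2:
        return None  # a pair needs at least two candidates

    best = max(responses, key=lambda r: r["reward"])
    worst = min(responses, key=lambda r: r["reward"])

    # Skip instructions whose best and worst responses are not separated
    # by more than the margin; such pairs carry little preference signal.
    if best["reward"] - worst["reward"] <= min_margin:
        return None

    return {
        "prompt": record["instruction"],
        "chosen": best["response"],
        "rejected": worst["response"],
    }


def build_dpo_dataset(records: List[Dict], min_margin: float = 0.0) -> List[Dict[str, str]]:
    """Apply build_dpo_pair to every record and drop the skipped ones."""
    pairs = (build_dpo_pair(record, min_margin) for record in records)
    return [pair for pair in pairs if pair is not None]
```

The `prompt` / `chosen` / `rejected` keys follow a layout commonly used by preference-tuning tooling. Best-versus-worst is only one pairing strategy; others, such as pairing the best response against a randomly sampled one, may suit some setups better.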
**Questions?** Contact [Zhangchen](https://www.zhangchenxu.com) by email.
### Citation
```
@article{xu2024stronger,
title={Stronger Models are NOT Stronger Teachers for Instruction Tuning},
author={Xu, Zhangchen and Jiang, Fengqing and Niu, Luyao and Lin, Bill Yuchen and Poovendran, Radha},
journal={arXiv preprint arXiv:2411.07133},
year={2024}
}
```
Provider:
maas
Created:
2025-01-15



