DPO-En-Zh-20k
ModelScope community. Updated 2025-12-01. Indexed 2024-12-21.
Download link:
https://modelscope.cn/datasets/baierfa/DPO-En-Zh-20k
Description:
This dataset is composed of:
- 4,000 examples of [argilla/distilabel-capybara-dpo-7k-binarized](https://huggingface.co/datasets/argilla/distilabel-capybara-dpo-7k-binarized) with chosen score>=4.
- 3,000 examples of [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs) with chosen score>=8.
- 3,000 examples of [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned) with chosen score>=4.
- 10,000 examples of [wenbopan/Chinese-dpo-pairs](https://huggingface.co/datasets/wenbopan/Chinese-dpo-pairs).
You can use it in [LLaMA Factory](https://github.com/hiyouga/LLaMA-Factory) by specifying `dataset: dpo_mix_en,dpo_mix_zh`.
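As a quick sanity check on the composition listed above, the following minimal Python sketch tallies the example counts and builds the `dataset` value. The per-source language tags, the mapping of sources to `dpo_mix_en` and `dpo_mix_zh`, and the `chosen_score` field name are assumptions made for illustration. They are not taken from the dataset files, and each source dataset may name its score column differently.

```python
# Composition as stated in the description: (source, examples, min chosen score, language).
# The language tags are an assumption based on the dataset name (En-Zh).
COMPOSITION = [
    ("argilla/distilabel-capybara-dpo-7k-binarized", 4000, 4, "en"),
    ("argilla/distilabel-intel-orca-dpo-pairs", 3000, 8, "en"),
    ("argilla/ultrafeedback-binarized-preferences-cleaned", 3000, 4, "en"),
    ("wenbopan/Chinese-dpo-pairs", 10000, None, "zh"),
]


def totals_by_language(composition):
    """Sum the example counts per language tag."""
    totals = {}
    for _source, count, _min_score, lang in composition:
        totals[lang] = totals.get(lang, 0) + count
    return totals


def keep_high_scoring(records, min_score, score_key="chosen_score"):
    """Keep records whose chosen-response score is at least min_score.

    score_key is a hypothetical field name used only for this sketch.
    A min_score of None means the source is not filtered by score.
    """
    if min_score is None:
        return list(records)
    return [r for r in records if r[score_key] >= min_score]


totals = totals_by_language(COMPOSITION)
total_examples = sum(totals.values())

# Build the comma-separated value used for the `dataset` setting.
dataset_arg = ",".join(f"dpo_mix_{lang}" for lang in totals)

print(totals)
print(total_examples)
print(dataset_arg)
```

Under these assumptions the English and Chinese halves are the same size, which is consistent with the "20k" in the dataset name.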
Provider:
maas
Created:
2024-12-17



