Self-taught-evaluator-DPO-data
收藏魔搭社区2025-12-04 更新2024-10-05 收录
下载链接:
https://modelscope.cn/datasets/AI-ModelScope/Self-taught-evaluator-DPO-data
下载链接
链接失效反馈官方服务:
资源简介:
This dataset is released as part of [Self-taught evaluators](https://arxiv.org/abs/2408.02666) research project.
Please refer to our [project materials](https://github.com/facebookresearch/RAM/tree/self_taught/projects/self_taught_evaluator) here for training and evaluation details.
## Loading the dataset with transformers
This dataset is built upon [WildChat](https://huggingface.co/datasets/allenai/WildChat-1M) prompts by using [Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) to generate responses and evaluation plans. Details on how to build such a self-taught dataset can be found in [Self-taught evaluators](https://arxiv.org/abs/2408.02666).
Minimal example below showing how to prepare training data.
```python
from datasets import load_dataset
dataset = load_dataset("facebook/Self-taught-evaluator-DPO-data")
WildChat = load_dataset("allenai/WildChat-1M")
hash_id2content = dict()
for ex in WildChat["train"]:
turn = ex["turn"]
hash_id2content[ex["conversation_hash"]] = ex["conversation"][2 * (turn - 1)]["content"]
train_data = []
for ex in dataset["train"]:
if ex["instruction"] not in hash_id2content:
continue
else:
ex["src"] = ex["src"].replace(ex["instruction"], hash_id2content[ex["instruction"]])
train_data.append(ex)
```
## Citation
If you use data, model, or code from this work, please cite with the following BibTex entry:
```
@article{wang2024self,
title={Self-taught evaluators},
author={Wang, Tianlu and Kulikov, Ilia and Golovneva, Olga and Yu, Ping and Yuan, Weizhe and Dwivedi-Yu, Jane and Pang, Richard Yuanzhe and Fazel-Zarandi, Maryam and Weston, Jason and Li, Xian},
journal={arXiv preprint arXiv:2408.02666},
year={2024}
}
```
## License
Use of this repository and related resources are governed by [Self-Taught Evaluator Research License](https://huggingface.co/facebook/Self-taught-evaluator-llama3.1-70B/blob/main/Research%20License%20for%20Self-taught%20Evaluator.pdf).
本数据集系作为**Self-taught evaluators(自主学习评估器)**研究项目的组成部分发布。有关训练与评估的详细信息,请参阅本项目的[配套资料](https://github.com/facebookresearch/RAM/tree/self_taught/projects/self_taught_evaluator)。
## 使用Transformers库加载数据集
本数据集基于[WildChat](https://huggingface.co/datasets/allenai/WildChat-1M)的提示构建,通过[Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct)生成回复与评估方案。关于此类自主学习数据集的构建细节,可参阅[Self-taught evaluators(自主学习评估器)](https://arxiv.org/abs/2408.02666)研究论文。
以下为用于准备训练数据的极简示例代码:
python
from datasets import load_dataset
dataset = load_dataset("facebook/Self-taught-evaluator-DPO-data")
WildChat = load_dataset("allenai/WildChat-1M")
hash_id2content = dict()
for ex in WildChat["train"]:
turn = ex["turn"]
hash_id2content[ex["conversation_hash"]] = ex["conversation"][2 * (turn - 1)]["content"]
train_data = []
for ex in dataset["train"]:
if ex["instruction"] not in hash_id2content:
continue
else:
ex["src"] = ex["src"].replace(ex["instruction"], hash_id2content[ex["instruction"]])
train_data.append(ex)
## 引用说明
若您使用本工作中的数据、模型或代码,请按照以下BibTex格式进行引用:
@article{wang2024self,
title={Self-taught evaluators},
author={Wang, Tianlu and Kulikov, Ilia and Golovneva, Olga and Yu, Ping and Yuan, Weizhe and Dwivedi-Yu, Jane and Pang, Richard Yuanzhe and Fazel-Zarandi, Maryam and Weston, Jason and Li, Xian},
journal={arXiv preprint arXiv:2408.02666},
year={2024}
}
## 许可协议
本仓库及相关资源的使用受[Self-Taught Evaluator Research License(自主学习评估器研究许可协议)](https://huggingface.co/facebook/Self-taught-evaluator-llama3.1-70B/blob/main/Research%20License%20for%20Self-taught%20Evaluator.pdf)约束。
提供机构:
maas
创建时间:
2024-10-04



