five

Self-taught-evaluator-DPO-data

收藏
魔搭社区2025-12-04 更新2024-10-05 收录
下载链接:
https://modelscope.cn/datasets/AI-ModelScope/Self-taught-evaluator-DPO-data
下载链接
链接失效反馈
官方服务:
资源简介:
This dataset is released as part of [Self-taught evaluators](https://arxiv.org/abs/2408.02666) research project. Please refer to our [project materials](https://github.com/facebookresearch/RAM/tree/self_taught/projects/self_taught_evaluator) here for training and evaluation details. ## Loading the dataset with transformers This dataset is built upon [WildChat](https://huggingface.co/datasets/allenai/WildChat-1M) prompts by using [Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) to generate responses and evaluation plans. Details on how to build such a self-taught dataset can be found in [Self-taught evaluators](https://arxiv.org/abs/2408.02666). Minimal example below showing how to prepare training data. ```python from datasets import load_dataset dataset = load_dataset("facebook/Self-taught-evaluator-DPO-data") WildChat = load_dataset("allenai/WildChat-1M") hash_id2content = dict() for ex in WildChat["train"]: turn = ex["turn"] hash_id2content[ex["conversation_hash"]] = ex["conversation"][2 * (turn - 1)]["content"] train_data = [] for ex in dataset["train"]: if ex["instruction"] not in hash_id2content: continue else: ex["src"] = ex["src"].replace(ex["instruction"], hash_id2content[ex["instruction"]]) train_data.append(ex) ``` ## Citation If you use data, model, or code from this work, please cite with the following BibTex entry: ``` @article{wang2024self, title={Self-taught evaluators}, author={Wang, Tianlu and Kulikov, Ilia and Golovneva, Olga and Yu, Ping and Yuan, Weizhe and Dwivedi-Yu, Jane and Pang, Richard Yuanzhe and Fazel-Zarandi, Maryam and Weston, Jason and Li, Xian}, journal={arXiv preprint arXiv:2408.02666}, year={2024} } ``` ## License Use of this repository and related resources are governed by [Self-Taught Evaluator Research License](https://huggingface.co/facebook/Self-taught-evaluator-llama3.1-70B/blob/main/Research%20License%20for%20Self-taught%20Evaluator.pdf).

本数据集系作为**Self-taught evaluators(自主学习评估器)**研究项目的组成部分发布。有关训练与评估的详细信息,请参阅本项目的[配套资料](https://github.com/facebookresearch/RAM/tree/self_taught/projects/self_taught_evaluator)。 ## 使用Transformers库加载数据集 本数据集基于[WildChat](https://huggingface.co/datasets/allenai/WildChat-1M)的提示构建,通过[Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct)生成回复与评估方案。关于此类自主学习数据集的构建细节,可参阅[Self-taught evaluators(自主学习评估器)](https://arxiv.org/abs/2408.02666)研究论文。 以下为用于准备训练数据的极简示例代码: python from datasets import load_dataset dataset = load_dataset("facebook/Self-taught-evaluator-DPO-data") WildChat = load_dataset("allenai/WildChat-1M") hash_id2content = dict() for ex in WildChat["train"]: turn = ex["turn"] hash_id2content[ex["conversation_hash"]] = ex["conversation"][2 * (turn - 1)]["content"] train_data = [] for ex in dataset["train"]: if ex["instruction"] not in hash_id2content: continue else: ex["src"] = ex["src"].replace(ex["instruction"], hash_id2content[ex["instruction"]]) train_data.append(ex) ## 引用说明 若您使用本工作中的数据、模型或代码,请按照以下BibTex格式进行引用: @article{wang2024self, title={Self-taught evaluators}, author={Wang, Tianlu and Kulikov, Ilia and Golovneva, Olga and Yu, Ping and Yuan, Weizhe and Dwivedi-Yu, Jane and Pang, Richard Yuanzhe and Fazel-Zarandi, Maryam and Weston, Jason and Li, Xian}, journal={arXiv preprint arXiv:2408.02666}, year={2024} } ## 许可协议 本仓库及相关资源的使用受[Self-Taught Evaluator Research License(自主学习评估器研究许可协议)](https://huggingface.co/facebook/Self-taught-evaluator-llama3.1-70B/blob/main/Research%20License%20for%20Self-taught%20Evaluator.pdf)约束。
提供机构:
maas
创建时间:
2024-10-04
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作