Atsunori/HelpSteer2-DPO

Name: Atsunori/HelpSteer2-DPO
Creator: Atsunori
Published: 2024-07-11 03:09:27
License: CC-BY-4.0

Source: Hugging Face. Updated 2024-07-11; indexed 2024-07-22.

Download link:

https://hf-mirror.com/datasets/Atsunori/HelpSteer2-DPO

Overview:

This dataset converts nvidia/HelpSteer2 into preference pairs, based on helpfulness scores, for Direct Preference Optimization (DPO) training. For each pair of responses, the one with the higher helpfulness score becomes the chosen response and the other becomes the rejected response. Pairs with identical helpfulness scores are discarded. The resulting dataset contains 7,221 training samples and 373 validation samples.
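The conversion rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the creator's actual script: it assumes each source row is a dict with `prompt`, `response`, and `helpfulness` keys and that the two responses to a prompt can be grouped by identical prompt text. The function name `to_preference_pairs` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def to_preference_pairs(rows: Iterable[dict]) -> List[Dict[str, str]]:
    """Build prompt/chosen/rejected records from rated responses."""
    # Assumption: responses to the same prompt share identical prompt text.
    by_prompt = defaultdict(list)
    for row in rows:
        by_prompt[row["prompt"]].append(row)

    pairs = []
    for prompt, group in by_prompt.items():
        if len(group) != 2:
            continue  # not a two-response pair; skipped in this sketch
        a, b = group
        if a["helpfulness"] == b["helpfulness"]:
            continue  # tie on helpfulness: the pair is discarded
        chosen, rejected = (a, b) if a["helpfulness"] > b["helpfulness"] else (b, a)
        pairs.append(
            {
                "prompt": prompt,
                "chosen_response": chosen["response"],
                "rejected_response": rejected["response"],
            }
        )
    return pairs
```

Because ties are dropped, the output can contain fewer pairs than there are prompts in the source data.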

Provider:

Atsunori

Original information summary

HelpSteer2-DPO dataset overview

Basic information

License: CC-BY-4.0
Language: English (en)
Tags: human-feedback
Dataset name: HelpSteer2-DPO
Size category: 1K<n<10K

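Dataset description

Source: converted from the nvidia/HelpSteer2 dataset.
Conversion method: based on each response's helpfulness score, the higher-scoring response becomes chosen_response and the lower-scoring response becomes rejected_response. Pairs whose two responses have the same helpfulness score are discarded.
Dataset structure:
- Features:
  - prompt: string
  - chosen_response: string
  - rejected_response: string
- Splits:
  - train: 7221 samples, 26707924 bytes
  - validation: 373 samples, 1369079 bytes
Download size: 14402663 bytes
Dataset size: 28077003 bytes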
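As a quick consistency check on the figures above, the sketch below models one record as a typed dictionary and confirms that the per-split byte counts add up to the stated dataset size. The names `PreferenceRecord` and `SPLITS` are illustrative only and do not come from the dataset itself.

```python
from typing import TypedDict


class PreferenceRecord(TypedDict):
    """One row of the dataset: three string features."""
    prompt: str
    chosen_response: str
    rejected_response: str


# Split statistics copied from the listing above.
SPLITS = {
    "train": {"num_examples": 7221, "num_bytes": 26707924},
    "validation": {"num_examples": 373, "num_bytes": 1369079},
}
DATASET_SIZE_BYTES = 28077003

total_examples = sum(s["num_examples"] for s in SPLITS.values())
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())

# The split sizes sum to the reported dataset size.
assert total_bytes == DATASET_SIZE_BYTES
# The total sample count falls in the 1K<n<10K size category.
assert 1_000 < total_examples < 10_000
```

The download size (14402663 bytes) is smaller than the dataset size, which is typical when the hosted files are stored in a compressed format.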

Configuration

Configuration name: default
- Data files:
  - train: data/train-*
  - validation: data/validation-*
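The data file entries above are glob patterns, one per split. The sketch below illustrates how such patterns map file paths to splits, using Python's standard `fnmatch` module. The shard file names in the usage lines are hypothetical, and the hosting platform's own pattern resolution may differ in detail from `fnmatch`.

```python
from fnmatch import fnmatch
from typing import Optional

# Patterns copied from the default configuration above.
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose glob pattern matches the path, if any."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


# Hypothetical shard names, for illustration only.
print(split_for("data/train-00000-of-00001.parquet"))
print(split_for("data/validation-00000-of-00001.parquet"))
```

Files that match neither pattern, such as a README, belong to no split.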

Update date

Last updated: 2024-07-11
