Columbia-NLP/DPO-py-dpo-v0.1

Name: Columbia-NLP/DPO-py-dpo-v0.1
Creator: Columbia-NLP
Published: 2024-07-10 16:06:35
License: 暂无描述

Hugging Face2024-07-10 更新2024-07-06 收录

下载链接：

https://hf-mirror.com/datasets/Columbia-NLP/DPO-py-dpo-v0.1

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集是从jondurbin/py-dpo-v0.1数据集重新格式化而来，用于DPO数据集的统一格式。它被用于训练LION-series模型，这些模型通过三个阶段（SFT、DPO和在线偏好学习）进行优化，以提高语言模型的性能。

This dataset is reformatted from jondurbin/py-dpo-v0.1 for training the LION-series models. It includes features such as prompt, prompt_id, chosen, rejected, messages, score_chosen, score_rejected, and other_info. The chosen and rejected features contain content and role, and messages also contain content and role. The dataset is split into a train set with 9466 examples. The LION-series models are trained using an optimized pipeline consisting of SFT, DPO, and online preference learning (online DPO).

提供机构：

Columbia-NLP

原始信息汇总

数据集概述

数据集信息

特征:
- prompt: 类型为 string
- prompt_id: 类型为 string
- chosen: 包含以下子特征
  - content: 类型为 string
  - role: 类型为 string
- rejected: 包含以下子特征
  - content: 类型为 string
  - role: 类型为 string
- messages: 包含以下子特征
  - content: 类型为 string
  - role: 类型为 string
- score_chosen: 类型为 float64
- score_rejected: 类型为 float64
- other_info: 包含以下子特征
  - source: 类型为 string
分割:
- train:
  - 字节数: 66121089
  - 样本数: 9466
下载大小: 26923890 字节
数据集大小: 66121089 字节

配置

配置名称: default
- 数据文件:
  - split: train
  - path: data/train-*

5,000+

优质数据集

54 个

任务类型

进入经典数据集