vicgalle/OpenHermesPreferences-1k
Source: Hugging Face. Updated 2024-02-29; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/vicgalle/OpenHermesPreferences-1k
Resource description:
---
dataset_info:
  features:
  - name: source
    dtype: string
  - name: category
    dtype: string
  - name: prompt
    dtype: string
  - name: candidates_completions
    sequence: string
  - name: candidate_policies
    sequence: string
  - name: ranks
    sequence: int64
  - name: rank_str
    dtype: string
  - name: chosen_policy
    dtype: string
  - name: rejected_policy
    dtype: string
  - name: chosen
    dtype: string
  - name: rejected
    dtype: string
  - name: len_chosen_response
    dtype: int64
  - name: len_rejected_response
    dtype: int64
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 22437232
    num_examples: 1111
  download_size: 10508529
  dataset_size: 22437232
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: other
source_datasets:
- argilla/OpenHermesPreferences
size_categories:
- 1K<n<10K
task_categories:
- text-generation
pretty_name: OpenHermesPreferences-1k
tags:
- synthetic
- rlaif
- dpo
language:
- en
---
## OpenHermesPreferences-1k ⚗️

OpenHermesPreferences-1k is a dataset of ~1,000 samples derived from [argilla/OpenHermesPreferences](https://huggingface.co/datasets/argilla/OpenHermesPreferences) using the [Long is More for Alignment](https://arxiv.org/abs/2402.04833) protocol. This protocol consists of selecting the ~1,000 longest responses (for the preferred/chosen ones) and provides a strong baseline to measure performance against.
Rather than sampling uniformly across the dataset categories, we used stratified sampling so that every category is retained. This leads to the following distribution of categories, whose counts sum to the 1,111 examples in the train split:

| category | count |
|:--------------------------|--------:|
| None | 400 |
| orca | 221 |
| coding | 110 |
| general | 85 |
| trivia | 50 |
| roleplay | 42 |
| writing | 31 |
| wordgame | 20 |
| stylized_response | 19 |
| joke | 17 |
| multiple_choice | 17 |
| plan | 13 |
| riddle | 12 |
| rp | 10 |
| misconception | 9 |
| gtkm | 8 |
| theory_of_mind | 7 |
| awareness | 7 |
| summarization | 5 |
| cot | 5 |
| counterfactual_contextual | 5 |
| editor | 4 |
| song | 4 |
| card | 2 |
| agent | 2 |
| experience | 2 |
| greeting | 2 |
| quiz | 1 |
| detailed_writing | 1 |
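
The selection described above can be illustrated with a minimal sketch. It is not the script used to build this dataset: the function name `select_longest_stratified`, the proportional quota with a floor of one row per category, and the use of character length on the `chosen` column are all assumptions made for illustration.

```python
from collections import defaultdict


def select_longest_stratified(rows, total, key="category", text="chosen"):
    """Pick about `total` rows, keeping every category and preferring long `text`.

    Hypothetical helper: the quota rule below is an assumption, not the
    procedure documented for OpenHermesPreferences-1k.
    """
    # Group rows by their category label.
    by_cat = defaultdict(list)
    for row in rows:
        by_cat[row[key]].append(row)

    selected = []
    for items in by_cat.values():
        # Assumed rule: proportional share of `total`, but never fewer than one row.
        quota = max(1, round(total * len(items) / len(rows)))
        # Within a category, keep the rows with the longest chosen response.
        items.sort(key=lambda r: len(r[text]), reverse=True)
        selected.extend(items[:quota])
    return selected


# Toy data: 8 "coding" rows and 2 "joke" rows with chosen responses of varying length.
rows = [{"category": "coding", "chosen": "x" * n} for n in range(1, 9)]
rows += [{"category": "joke", "chosen": "y" * n} for n in (3, 7)]
subset = select_longest_stratified(rows, total=5)
```

Because each quota is rounded and floored at one, the size of the result can differ slightly from `total`, which is consistent with a "~1,000" target yielding 1,111 rows, although the actual rule used here is not stated.
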
## Usage
The dataset already has the columns `prompt`, `chosen` and `rejected`, so it is trivially compatible with the [DPOTrainer](https://huggingface.co/docs/trl/en/dpo_trainer) from the trl library.
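
As a rough sketch of what that compatibility means in practice, the snippet below loads the train split with the `datasets` library and reduces a record to the three columns a preference trainer reads. The helper names `load_preferences` and `to_dpo_example` are hypothetical, and the trainer setup itself is omitted because its arguments vary between trl versions.

```python
def load_preferences(repo_id="vicgalle/OpenHermesPreferences-1k"):
    """Download the train split from the Hugging Face Hub (needs network access)."""
    from datasets import load_dataset  # requires `pip install datasets`

    return load_dataset(repo_id, split="train")


def to_dpo_example(row):
    """Keep only the prompt/chosen/rejected triple from a dataset record."""
    return {name: row[name] for name in ("prompt", "chosen", "rejected")}


# Toy record standing in for a real row, so the sketch runs offline.
example = to_dpo_example(
    {"prompt": "Hi", "chosen": "Hello there!", "rejected": "Hey", "category": "greeting"}
)
```

With the real data, `load_preferences().map(to_dpo_example, remove_columns=[...])` or simply passing the loaded split as the training dataset should be sufficient; check the DPOTrainer documentation for the exact arguments in your trl version.
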
## License
`OpenHermesPreferences-1k` inherits the license of the source dataset [`teknium/OpenHermes-2.5`](https://huggingface.co/datasets/teknium/OpenHermes-2.5), which is currently listed as `other` to account for the varying licenses of its sources.
Provider: vicgalle



