arcee-ai/OpenHermesPreferences-binarized
收藏Hugging Face2024-05-18 更新2024-06-12 收录
下载链接:
https://hf-mirror.com/datasets/arcee-ai/OpenHermesPreferences-binarized
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: prompt
dtype: string
- name: chosen
list:
- name: content
dtype: string
- name: role
dtype: string
- name: rejected
list:
- name: content
dtype: string
- name: role
dtype: string
splits:
- name: train
num_bytes: 3925868587
num_examples: 989490
download_size: 2036458464
dataset_size: 3925868587
configs:
- config_name: default
data_files:
- split: train
path: data/train-*
license: other
language:
- en
tags:
- dpo
- orpo
- synthetic
- distilabel
size_categories:
- 100K<n<1M
---
**OpenHermesPreferences** is a dataset of **~1 million** AI preferences derived from `teknium/OpenHermes-2.5`.
It combines responses from the source dataset with those from two other models, `Mixtral-8x7B-Instruct-v0.1` and `Nous-Hermes-2-Yi-34B`,
and uses `PairRM` as the preference model to score and rank the generations. The dataset can be used for training preference models or aligning language
models through techniques like Direct Preference Optimization.
Reference: https://huggingface.co/datasets/argilla/OpenHermesPreferences?row=0
提供机构:
arcee-ai
原始信息汇总
数据集概述
数据集信息
- 名称: OpenHermesPreferences
- 大小: 约100万条AI偏好数据
- 来源: 基于
teknium/OpenHermes-2.5,结合Mixtral-8x7B-Instruct-v0.1和Nous-Hermes-2-Yi-34B模型的响应 - 用途: 用于训练偏好模型或通过直接偏好优化等技术对语言模型进行对齐
数据集特征
- prompt: 字符串类型
- chosen:
- content: 字符串类型
- role: 字符串类型
- rejected:
- content: 字符串类型
- role: 字符串类型
数据集分割
- train:
- 数据量: 989490个示例
- 存储大小: 3925868587字节
数据集大小
- 下载大小: 2036458464字节
- 数据集大小: 3925868587字节
语言
- 主要语言: 英语
标签
- dpo
- orpo
- synthetic
- distilabel
许可证
- 类型: 其他
大小分类
- 范围: 100K<n<1M



