LRudL/anthropic_hh_modified
Hugging Face | Updated 2023-03-21 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/LRudL/anthropic_hh_modified
Resource description:
---
dataset_info:
  features:
  - name: choice0
    dtype: string
  - name: choice1
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': '0'
          '1': '1'
  splits:
  - name: train
    num_bytes: 56635938
    num_examples: 42537
  - name: test
    num_bytes: 3195756
    num_examples: 2312
  download_size: 33135346
  dataset_size: 59831694
---
# Dataset Card for "anthropic_hh_modified"
A complete copy of [Anthropic/hh-rlhf](https://huggingface.co/datasets/Anthropic/hh-rlhf) (all credit and rights belong to its authors), with the format modified so that it can be used with the [eleuther-elk](https://github.com/EleutherAI/elk/tree/main/elk) repository.
Changes:
- rename column "chosen" to "choice0" and "rejected" to "choice1"
- randomly swap the contents of columns choice0 and choice1 for half of the entries
- create a ClassLabel column "label" that stores an integer 0 or 1, corresponding to which of choice0 or choice1 was preferred by the human.
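
The three changes above can be sketched in plain Python. This is only an illustration, not the script that produced the dataset: the function name `to_elk_format`, the `seed` parameter, and the use of an independent coin flip per row (rather than swapping exactly half of the rows) are assumptions.

```python
import random


def to_elk_format(rows, seed=0):
    """Convert rows with "chosen"/"rejected" keys into choice0/choice1/label rows.

    Hypothetical helper: the per-row coin flip is an assumption about how
    "half of the entries" were selected.
    """
    rng = random.Random(seed)
    converted = []
    for row in rows:
        if rng.random() < 0.5:
            # Swapped: the human-preferred text ends up in choice1, so label = 1.
            converted.append(
                {"choice0": row["rejected"], "choice1": row["chosen"], "label": 1}
            )
        else:
            # Not swapped: the human-preferred text stays in choice0, so label = 0.
            converted.append(
                {"choice0": row["chosen"], "choice1": row["rejected"], "label": 0}
            )
    return converted


# Made-up example row in the chosen/rejected layout.
example_rows = [
    {
        "chosen": "Human: Hi\n\nAssistant: Hello! How can I help?",
        "rejected": "Human: Hi\n\nAssistant: Go away.",
    },
]
for item in to_elk_format(example_rows):
    preferred = item[f"choice{item['label']}"]
    print(item["label"], repr(preferred))
```

Whatever the flip, `label` always indexes the human-preferred text, so `row[f"choice{row['label']}"]` recovers the original "chosen" entry.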
Provided by:
LRudL
Summary of original information
Dataset overview
Dataset name
"anthropic_hh_modified"
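Data structure
- Feature names and types
  - choice0: string
  - choice1: string
  - label: class label with two classes, 0 and 1
Data splits
- Training set
  - Number of examples: 42537
  - Size: 56635938 bytes
- Test set
  - Number of examples: 2312
  - Size: 3195756 bytes
Dataset size
- Download size: 33135346 bytes
- Total dataset size: 59831694 bytes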
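
As a quick consistency check on the figures above, the following sketch sums the split sizes and derives two ratios. The numbers are copied from the metadata; the variable names are illustrative only.

```python
# Values copied from the dataset metadata listed above.
splits = {
    "train": {"num_bytes": 56635938, "num_examples": 42537},
    "test": {"num_bytes": 3195756, "num_examples": 2312},
}
download_size = 33135346
dataset_size = 59831694

total_bytes = sum(s["num_bytes"] for s in splits.values())
total_examples = sum(s["num_examples"] for s in splits.values())

# Share of all examples that fall in the test split.
test_fraction = splits["test"]["num_examples"] / total_examples
# Download size relative to the reported dataset size.
download_ratio = download_size / dataset_size

print("split bytes sum to dataset_size:", total_bytes == dataset_size)
print("total examples:", total_examples)
print(f"test fraction: {test_fraction:.3f}")
print(f"download / dataset size: {download_ratio:.3f}")
```

The two split sizes add up exactly to the reported total of 59831694 bytes, and the test split holds roughly one in twenty examples.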



