HuggingFaceH4/instruct_me
Source: Hugging Face | Updated: 2023-03-06 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/HuggingFaceH4/instruct_me
Resource description:
---
license: apache-2.0
dataset_info:
- config_name: instruction_tuning
  features:
  - name: text
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 29975565
    num_examples: 41685
  - name: test
    num_bytes: 3298059
    num_examples: 4632
  download_size: 18425612
  dataset_size: 33273624
- config_name: reward_modelling
  features:
  - name: text
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 25274204
    num_examples: 41685
  - name: test
    num_bytes: 2777314
    num_examples: 4632
  download_size: 15636566
  dataset_size: 28051518
- config_name: ppo
  features:
  - name: prompt
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 50787070
    num_examples: 83371
  - name: test
    num_bytes: 5715727
    num_examples: 9264
  download_size: 31461165
  dataset_size: 56502797
- config_name: reward_modeling
  features:
  - name: prompt
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 25274204
    num_examples: 41685
  - name: test
    num_bytes: 2777314
    num_examples: 4632
  download_size: 15636838
  dataset_size: 28051518
task_categories:
- conversational
- text-generation
language:
- en
tags:
- human-feedback
- instruct
- reward-modeling
pretty_name: Instruct Me
---
# Dataset card for Instruct Me
## Dataset Description
- **Homepage:**
- **Repository:**
- **Paper:**
- **Leaderboard:**
- **Point of Contact:** Lewis Tunstall
### Dataset summary
Instruct Me is a dataset of prompts and instruction dialogues between a human user and an AI assistant. The prompts are derived from (prompt, completion) pairs in the [Helpful Instructions dataset](https://huggingface.co/datasets/HuggingFaceH4/helpful_instructions). The goal is to train a language model that is "chatty" and can answer the kinds of questions or perform the kinds of tasks a human user might give an AI assistant.
### Supported Tasks and Leaderboard
We provide 3 configs that can be used for training RLHF models:
#### instruction_tuning
Single-turn user/bot dialogues for instruction tuning.
#### reward_modeling
Prompts to generate model completions and collect human preference data.
#### ppo
Prompts to generate model completions for optimization of the instruction-tuned model with techniques like PPO.
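As a minimal sketch of how one of these configs might be loaded, the snippet below assumes the Hugging Face `datasets` library is installed and that the repository ID `HuggingFaceH4/instruct_me` is still reachable. The `TEXT_COLUMN` mapping and the `load_config` helper are illustrative names introduced here, not part of the dataset.

```python
# Column holding the main text in each config, per the v1.1.0 schema in this card.
TEXT_COLUMN = {
    "instruction_tuning": "text",
    "reward_modeling": "prompt",
    "ppo": "prompt",
}


def load_config(config_name, split="train"):
    """Load one config of Instruct Me (hypothetical helper; needs network access)."""
    if config_name not in TEXT_COLUMN:
        raise ValueError(f"unknown config: {config_name!r}")
    # Imported lazily so the mapping above is usable without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("HuggingFaceH4/instruct_me", config_name, split=split)


# Example (requires network access and the `datasets` package):
#   ds = load_config("ppo", split="test")
#   print(ds[0][TEXT_COLUMN["ppo"]])
```

Keeping the column name next to the config name avoids hard-coding `text` or `prompt` in downstream code.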
### Changelog
* March 6, 2023: `v1.1.0` release. Changed the `text` columns for the `reward_modeling` and `ppo` configs to `prompt` for consistency with our dataset schemas elsewhere.
* March 5, 2023: `v1.0.0` release.
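Code written against `v1.0.0` may still expect a `text` column in the `reward_modeling` and `ppo` configs. One possible way to tolerate both schemas is sketched below; `get_prompt` is a hypothetical helper, and it assumes each record behaves like a plain dictionary.

```python
def get_prompt(record):
    """Return the prompt text from a record, whichever column name it uses."""
    if "prompt" in record:  # v1.1.0 column name
        return record["prompt"]
    if "text" in record:  # v1.0.0 column name
        return record["text"]
    raise KeyError("record has neither a 'prompt' nor a 'text' column")
```

This is intended only for the two prompt configs; in `instruction_tuning`, the `text` column holds a full dialogue rather than a bare prompt.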
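Provider:
HuggingFaceH4
Summary of original information
Dataset overview
Basic information
- License: Apache-2.0
Dataset configs
Config 1: instruction_tuning
- Features:
  - text: string
  - meta: struct containing source and config, both strings
- Splits:
  - train: 41685 examples, 29975565 bytes
  - test: 4632 examples, 3298059 bytes
- Download size: 18425612 bytes
- Dataset size: 33273624 bytes
Config 2: reward_modelling
- Features:
  - text: string
  - meta: struct containing source and config, both strings
- Splits:
  - train: 41685 examples, 25274204 bytes
  - test: 4632 examples, 2777314 bytes
- Download size: 15636566 bytes
- Dataset size: 28051518 bytes
Config 3: ppo
- Features:
  - prompt: string
  - meta: struct containing source and config, both strings
- Splits:
  - train: 83371 examples, 50787070 bytes
  - test: 9264 examples, 5715727 bytes
- Download size: 31461165 bytes
- Dataset size: 56502797 bytes
Config 4: reward_modeling
- Features:
  - prompt: string
  - meta: struct containing source and config, both strings
- Splits:
  - train: 41685 examples, 25274204 bytes
  - test: 4632 examples, 2777314 bytes
- Download size: 15636838 bytes
- Dataset size: 28051518 bytes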
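The byte and example counts listed for each config can be cross-checked with a little arithmetic. The sketch below copies the figures for the three configs described in the card and checks that each dataset size equals the sum of its split sizes. It also computes the train fraction, which comes out close to 0.9 for every config; that is an observation from the numbers, not a documented split policy. The names `CONFIGS`, `sizes_consistent`, and `train_fraction` are illustrative.

```python
# (train_bytes, test_bytes, dataset_size, train_examples, test_examples),
# copied from the per-config figures listed above.
CONFIGS = {
    "instruction_tuning": (29975565, 3298059, 33273624, 41685, 4632),
    "reward_modeling": (25274204, 2777314, 28051518, 41685, 4632),
    "ppo": (50787070, 5715727, 56502797, 83371, 9264),
}


def sizes_consistent(configs):
    """Map each config to True if train_bytes + test_bytes == dataset_size."""
    return {
        name: train_b + test_b == total
        for name, (train_b, test_b, total, _, _) in configs.items()
    }


def train_fraction(configs):
    """Map each config to the share of examples that fall in the train split."""
    return {
        name: train_n / (train_n + test_n)
        for name, (_, _, _, train_n, test_n) in configs.items()
    }


for name, ok in sizes_consistent(CONFIGS).items():
    print(f"{name}: sizes consistent = {ok}, "
          f"train fraction = {train_fraction(CONFIGS)[name]:.4f}")
```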
Language
- Supported language: English
Task categories
- Conversational
- Text generation
Tags
- Human feedback
- Instruct
- Reward modeling



