instruct_me
Source: ModelScope community (魔搭社区) · Updated 2025-12-05 · Indexed 2025-02-15
Download link:
https://modelscope.cn/datasets/HuggingFaceH4/instruct_me
Resource description:
# Dataset card for Instruct Me
## Dataset Description
- **Homepage:**
- **Repository:**
- **Paper:**
- **Leaderboard:**
- **Point of Contact:** Lewis Tunstall
### Dataset summary
Instruct Me is a dataset of prompts and instruction dialogues between a human user and an AI assistant. The prompts are derived from (prompt, completion) pairs in the [Helpful Instructions dataset](https://huggingface.co/datasets/HuggingFaceH4/helpful_instructions). The goal is to train a language model that is "chatty" and can answer the kinds of questions or perform the kinds of tasks a human user might give an AI assistant.
### Supported Tasks and Leaderboard
We provide 3 configs that can be used for training RLHF models:
#### instruction_tuning
Single-turn user/bot dialogues for instruction tuning.
#### reward_modeling
Prompts to generate model completions and collect human preference data.
#### ppo
Prompts to generate model completions for optimization of the instruction-tuned model with techniques like PPO.
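As a rough illustration of how one of the three configs might be selected, the sketch below wraps `datasets.load_dataset` from the Hugging Face `datasets` library. The repository ID `HuggingFaceH4/instruct_me` is inferred from the download link and may not resolve on every hub or mirror; the helper name `load_config` and the `CONFIGS` tuple are hypothetical, and the card does not state which splits exist, so `split` is left to the caller.

```python
# Config names as listed in this card.
CONFIGS = ("instruction_tuning", "reward_modeling", "ppo")

# Assumption: repository ID inferred from the download link above.
REPO_ID = "HuggingFaceH4/instruct_me"


def load_config(name, split=None):
    """Load one config of the dataset (hypothetical helper).

    Raises ValueError for a config name this card does not list.
    """
    if name not in CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the name check works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split=split)
```

Calling `load_config("reward_modeling")` would then attempt a download, which requires network access and the `datasets` package.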
### Changelog
* March 6, 2023: `v1.1.0` release. Renamed the `text` column in the `reward_modeling` and `ppo` configs to `prompt` for consistency with our other dataset schemas.
* March 5, 2023: `v1.0.0` release.
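Code written against `v1.0.0` may still encounter records keyed by `text`. A minimal sketch of tolerating both layouts for the `reward_modeling` and `ppo` configs follows; the function name `normalize_prompt_column` is hypothetical, and records are assumed to be plain dictionaries.

```python
def normalize_prompt_column(record):
    """Return a copy of `record` that uses the v1.1.0 `prompt` key.

    A v1.0.0-style record with a `text` key has that key renamed to
    `prompt`; records that already have `prompt`, or have neither key,
    are copied unchanged.
    """
    if "prompt" in record or "text" not in record:
        return dict(record)
    renamed = {key: value for key, value in record.items() if key != "text"}
    renamed["prompt"] = record["text"]
    return renamed
```

With the `datasets` library, a function like this could be passed to `Dataset.map`, though the rename is only relevant when mixing data from both releases.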
Provider: maas
Created: 2025-02-10



