
HuggingFaceH4/instruct_me

Source: Hugging Face. Updated: 2023-03-06. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/HuggingFaceH4/instruct_me
Resource description:
---
license: apache-2.0
dataset_info:
- config_name: instruction_tuning
  features:
  - name: text
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 29975565
    num_examples: 41685
  - name: test
    num_bytes: 3298059
    num_examples: 4632
  download_size: 18425612
  dataset_size: 33273624
- config_name: reward_modelling
  features:
  - name: text
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 25274204
    num_examples: 41685
  - name: test
    num_bytes: 2777314
    num_examples: 4632
  download_size: 15636566
  dataset_size: 28051518
- config_name: ppo
  features:
  - name: prompt
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 50787070
    num_examples: 83371
  - name: test
    num_bytes: 5715727
    num_examples: 9264
  download_size: 31461165
  dataset_size: 56502797
- config_name: reward_modeling
  features:
  - name: prompt
    dtype: string
  - name: meta
    struct:
    - name: source
      dtype: string
    - name: config
      dtype: string
  splits:
  - name: train
    num_bytes: 25274204
    num_examples: 41685
  - name: test
    num_bytes: 2777314
    num_examples: 4632
  download_size: 15636838
  dataset_size: 28051518
task_categories:
- conversational
- text-generation
language:
- en
tags:
- human-feedback
- instruct
- reward-modeling
pretty_name: Instruct Me
---

# Dataset card for Instruct Me

## Dataset Description

- **Homepage:**
- **Repository:**
- **Paper:**
- **Leaderboard:**
- **Point of Contact:** Lewis Tunstall

### Dataset summary

Instruct Me is a dataset of prompts and instruction dialogues between a human user and an AI assistant. The prompts are derived from (prompt, completion) pairs in the [Helpful Instructions dataset](https://huggingface.co/datasets/HuggingFaceH4/helpful_instructions). The goal is to train a language model that is "chatty" and can answer the kinds of questions or perform the kinds of tasks a human user might give an AI assistant.

### Supported Tasks and Leaderboard

We provide 3 configs that can be used for training RLHF models:

#### instruction_tuning

Single-turn user/bot dialogues for instruction tuning.

#### reward_modeling

Prompts to generate model completions and collect human preference data.

#### ppo

Prompts to generate model completions for optimization of the instruction-tuned model with techniques like PPO.

### Changelog

* March 6, 2023: `v1.1.0` release. Changed the `text` columns for the `reward_modeling` and `ppo` configs to `prompt` for consistency with our dataset schemas elsewhere.
* March 5, 2023: `v1.0.0` release.
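As a rough illustration of how the configs described above might be consumed, the sketch below wraps `load_dataset` from the Hugging Face `datasets` library and normalizes the `text`/`prompt` column rename mentioned in the changelog. The helper names `load_config` and `prompt_text` are hypothetical and not part of the dataset or the library, and the sketch assumes the repository is still reachable on the Hub under the config names listed in the metadata.

```python
from typing import Any, Dict

# Config names as listed in the card's metadata. Both spellings
# "reward_modelling" and "reward_modeling" appear there.
KNOWN_CONFIGS = ("instruction_tuning", "reward_modelling", "reward_modeling", "ppo")


def load_config(name: str, split: str = "train"):
    """Load one config of HuggingFaceH4/instruct_me (requires network access)."""
    if name not in KNOWN_CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {KNOWN_CONFIGS}")
    # Imported lazily so that prompt_text() works without the library installed.
    from datasets import load_dataset

    return load_dataset("HuggingFaceH4/instruct_me", name, split=split)


def prompt_text(example: Dict[str, Any]) -> str:
    """Return the prompt of a row, accepting the pre-v1.1.0 `text` column too."""
    if "prompt" in example:
        return example["prompt"]
    if "text" in example:
        return example["text"]
    raise KeyError("row has neither a 'prompt' nor a 'text' column")
```

Under these assumptions, `load_config("ppo", split="test")` would return the test split of the `ppo` config, and `prompt_text(row)` would work on rows from either schema version.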
Provider:
HuggingFaceH4
Summary of original information

Dataset overview

Basic information

  • License: Apache-2.0

Dataset configurations

Config 1: instruction_tuning

  • Features:
    • text: string
    • meta: struct containing source and config, both strings
  • Splits:
    • train: 41685 examples, 29975565 bytes
    • test: 4632 examples, 3298059 bytes
  • Download size: 18425612 bytes
  • Dataset size: 33273624 bytes

Config 2: reward_modelling

  • Features:
    • text: string
    • meta: struct containing source and config, both strings
  • Splits:
    • train: 41685 examples, 25274204 bytes
    • test: 4632 examples, 2777314 bytes
  • Download size: 15636566 bytes
  • Dataset size: 28051518 bytes

Config 3: ppo

  • Features:
    • prompt: string
    • meta: struct containing source and config, both strings
  • Splits:
    • train: 83371 examples, 50787070 bytes
    • test: 9264 examples, 5715727 bytes
  • Download size: 31461165 bytes
  • Dataset size: 56502797 bytes

Config 4: reward_modeling

  • Features:
    • prompt: string
    • meta: struct containing source and config, both strings
  • Splits:
    • train: 41685 examples, 25274204 bytes
    • test: 4632 examples, 2777314 bytes
  • Download size: 15636838 bytes
  • Dataset size: 28051518 bytes
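The byte and example counts listed for each config can be sanity-checked with a little arithmetic: the dataset size should equal the sum of the train and test split sizes, and the test split appears to be about 10% of the examples. The following is a minimal sketch that uses only the numbers quoted above; the `CONFIGS` table and `check_config` helper are illustrative names, not part of the dataset.

```python
# (train_bytes, train_examples, test_bytes, test_examples, dataset_size),
# copied from the configuration listings above.
CONFIGS = {
    "instruction_tuning": (29975565, 41685, 3298059, 4632, 33273624),
    "reward_modeling": (25274204, 41685, 2777314, 4632, 28051518),
    "ppo": (50787070, 83371, 5715727, 9264, 56502797),
}


def check_config(stats):
    """Return (split sizes sum to dataset size, fraction of examples in test)."""
    train_bytes, train_n, test_bytes, test_n, dataset_size = stats
    sizes_match = train_bytes + test_bytes == dataset_size
    test_fraction = test_n / (train_n + test_n)
    return sizes_match, test_fraction


REPORT = {name: check_config(stats) for name, stats in CONFIGS.items()}

for name, (sizes_match, test_fraction) in REPORT.items():
    print(f"{name}: sizes match = {sizes_match}, test fraction = {test_fraction:.4f}")
```

The download size is smaller than the dataset size in every config, presumably because the hosted files are stored in a compressed format; that is an inference from the numbers rather than something the card states.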

Language

  • Supported language: English

Task categories

  • conversational
  • text-generation

Tags

  • human-feedback
  • instruct
  • reward-modeling