
alizeepace/rejection_sampling_phi_2_OA_rm

Source: Hugging Face. Updated 2024-01-10. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/alizeepace/rejection_sampling_phi_2_OA_rm
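Summary:
The Rejection Sampling Phi-2 with OpenAssistant RM dataset contains 10 prompt-response pairs, obtained by rejection sampling over 10 Phi-2 generations per prompt using the OpenAssistant reward model. The dataset supports question answering, text generation, and conversational agent tasks. Each data instance includes the model prompt, the best generated text, and the reward model score. The dataset is not split into training, validation, or test sets. To create it, prompts were written or selected by humans, responses were generated with the Phi-2 language model, and each response was scored with the OpenAssistant reward model. The dataset contains no personal or sensitive information, but it may contain offensive or harmful language.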

Provided by:
alizeepace
Summary of original information

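Dataset card: Rejection Sampling Phi-2 with OpenAssistant RM

Dataset description

Dataset summary

The "Rejection Sampling Phi-2 with OpenAssistant RM" dataset consists of 10 pairs of prompts and responses, generated by rejection sampling over 10 Phi-2 generations using the OpenAssistant reward model.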

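Supported tasks and leaderboards

The dataset and the rationale behind its creation can be used to support question answering, text generation, or conversational agent models.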

Languages

The language of the dataset is English (en).

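Dataset structure

Data instances

Each instance consists of a list of model prompts, a string holding the best generation, and a float holding the reward model score.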

json { "prompt": ["Write a story about a lion that lives in the city and falls in love with a pigeon."], "generation": "Instruction: Write a story about a lion that lives in the city and falls in love with a pigeon. Output: Leo was a lion who lived in the city. He hated the noise and the pollution. He longed for the freedom and the beauty of the savanna. He spent his days wandering around the streets, looking for prey or a place to nap.

One day, he spotted a pigeon sitting on a ledge. He was hungry and decided to try to catch it. He roared loudly, hoping to scare the bird away. But the pigeon was not afraid. It flew up and landed on his nose.

Leo was surprised and annoyed. He puffed up his mane and tried to shake the pigeon off. But the pigeon was persistent. It pecked at his nose and made jokes.

Leo was irritated and bored. He decided to play along. He rubbed his nose and made funny faces. The pigeon laughed and joined in. They became friends and had fun together.

Leo realized that the pigeon was not so bad. It was actually quite smart and funny. It taught him new words and tricks. It showed him new sights and sounds. It made him forget about his problems and enjoy the moment.

Leo fell in love with the pigeon. He wanted to be with it forever. He asked it to leave the city and follow him to the savanna. The pigeon agreed. It said it would always be his friend and his mate.

Leo and the pigeon left the city and embarked on a new adventure. They found happiness and peace in the wild. They lived happily ever after. ", "rm_score": 3.1899418831 }

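Data fields

  • prompt: a list of the prompts used to query Phi-2; each string represents one turn.
  • generation: the highest-scoring Phi-2 generation out of 10, including the formatted prompt.
  • rm_score: a float holding the OpenAssistant reward score assigned to the generation.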
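
The three fields can be modelled as a small typed record. The following is a minimal sketch, not part of the original card: the names `Record` and `parse_record` are hypothetical, and the sample row is abbreviated (the generation text is shortened for illustration).

```python
import json
from typing import List, TypedDict


class Record(TypedDict):
    """One dataset row, using the three documented field names."""
    prompt: List[str]   # one string per conversation turn
    generation: str     # best-of-10 Phi-2 output, including the formatted prompt
    rm_score: float     # OpenAssistant reward score for that generation


def parse_record(raw: str) -> Record:
    """Parse a JSON string and check that it matches the documented fields."""
    data = json.loads(raw)
    prompt = data["prompt"]
    if not isinstance(prompt, list) or not all(isinstance(t, str) for t in prompt):
        raise TypeError("prompt must be a list of strings")
    if not isinstance(data["generation"], str):
        raise TypeError("generation must be a string")
    score = data["rm_score"]
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise TypeError("rm_score must be a number")
    return Record(prompt=prompt, generation=data["generation"], rm_score=float(score))


# Abbreviated sample row: the generation text is shortened for illustration.
sample = json.dumps({
    "prompt": ["Write a story about a lion that lives in the city and falls in love with a pigeon."],
    "generation": "Instruction: Write a story about a lion that lives in the city "
                  "and falls in love with a pigeon. Output: Leo was a lion who lived in the city.",
    "rm_score": 3.1899418831,
})
record = parse_record(sample)
print(record["rm_score"])
```

Validating the types up front is a design choice of this sketch; it makes a malformed row fail early instead of surfacing later during analysis.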

Data splits

The dataset is not split.

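Dataset creation

Source data

Initial data collection and normalization

Prompts were written or selected by humans to cover a range of instruction-following abilities (creativity, factuality, reasoning, etc.).

Generations were produced for each prompt using the Phi-2 language model. Following the Phi-2 QA and chat prompting instructions, prompts were formatted as follows:

  • Single-turn queries: "Instruction: {query} Output: "
  • Multi-turn queries: "Alice: {query_turn_1} Bob: {query_turn_2}... Alice: {query_turn_n} Bob: "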
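
The two templates above can be sketched as a single formatting helper. This is an illustrative sketch only: the function name `format_phi2_prompt` is hypothetical, and the `sep` separator is an assumption (the templates are shown here with spaces, but the source page may have collapsed line breaks).

```python
from typing import List


def format_phi2_prompt(turns: List[str], sep: str = " ") -> str:
    """Format a list of turns with the single-turn or multi-turn template.

    `sep` is an assumed separator between segments; pass "\n" if the
    original formatting used line breaks.
    """
    if not turns:
        raise ValueError("at least one turn is required")
    if len(turns) == 1:
        # Single-turn query: "Instruction: {query} Output: "
        return f"Instruction: {turns[0]}{sep}Output: "
    if len(turns) % 2 == 0:
        # The multi-turn template ends on an Alice turn, so the turn count is odd.
        raise ValueError("multi-turn prompts must end on an Alice turn")
    speakers = ("Alice", "Bob")
    parts = [f"{speakers[i % 2]}: {turn}" for i, turn in enumerate(turns)]
    # The trailing "Bob: " cues the model to write the next reply.
    return sep.join(parts) + f"{sep}Bob: "


print(format_phi2_prompt(["What is 2 + 2?"]))
print(format_phi2_prompt(["Hi", "Hello, how can I help?", "Tell me a joke."]))
```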

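Of the 10 generations produced for each prompt, only the highest-scoring one is retained; see Annotations for details on the scoring process.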
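
The selection step can be sketched as a best-of-n loop. This is a minimal sketch under stated assumptions, not the author's pipeline: `best_of_n`, `make_toy_generator`, and `toy_score` are hypothetical names, and the toy generator and scorer stand in for Phi-2 and the OpenAssistant reward model so the example runs without any model.

```python
from typing import Callable, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    n: int = 10,
) -> Tuple[str, float]:
    """Sample n generations and keep the one with the highest score."""
    candidates = [generate(prompt) for _ in range(n)]
    scored = [(score(prompt, candidate), candidate) for candidate in candidates]
    best_score, best_generation = max(scored, key=lambda pair: pair[0])
    return best_generation, best_score


def make_toy_generator() -> Callable[[str], str]:
    """Toy stand-in for a language model: the k-th call appends k '!' marks."""
    calls = 0

    def generate(prompt: str) -> str:
        nonlocal calls
        calls += 1
        return prompt + "!" * calls

    return generate


def toy_score(prompt: str, generation: str) -> float:
    """Toy stand-in for a reward model: prefers exactly three added characters."""
    added = len(generation) - len(prompt)
    return float(3 - abs(added - 3))


best, reward = best_of_n("Instruction: hi Output: ", make_toy_generator(), toy_score)
print(repr(best), reward)
```

In a real setting `generate` would sample from the language model and `score` would call the reward model; only the returned pair corresponds to the `generation` and `rm_score` fields.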

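Source language producers

2 of the 10 prompts are taken from the GSM8K dataset to probe reasoning ability. The rest were written by human writers.

Generations were produced for each prompt using the Phi-2 language model.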

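Annotations

Annotation process

All generations were scored using an external reward model.

Annotators

Reward scores were produced for each generation using the Open Assistant reward model.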

Personal and sensitive information

No personal or sensitive information.

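Considerations for using the data

Social impact of the dataset

Some prompts are deliberately designed to probe a model's ability to generate or reward safe and harmless text. As a result, prompts and generations may contain offensive or toxic language.

Other known limitations

The limited size of the dataset does not allow a thorough exploration of all aspects of a model's instruction-following ability.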

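Additional information

Dataset curators

The dataset was collected by Alizée Pace.

Licensing information

The dataset is released under the Apache-2.0 license.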
