RLHF-And-Friends/Human-vs-Shapa-8x

Name: RLHF-And-Friends/Human-vs-Shapa-8x
Creator: RLHF-And-Friends
Published: 2025-03-28 10:54:20
License: No description available

Source: Hugging Face (updated 2025-03-28, indexed 2025-04-26)

Download link:

https://hf-mirror.com/datasets/RLHF-And-Friends/Human-vs-Shapa-8x

Description:

This dataset pairs human completions from the test split of RLHF-And-Friends/tldr-sft with completions generated by the borisshapa/ppo-8x-mistral-7b-smallsft-tldr model. A column named prompt stores the prompt given to both the human writers and the model. According to evaluations by gpt-4o-mini and gpt-4o, the model's completions achieve a high win rate against the human completions.
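A minimal sketch of how one might inspect the dataset and tally a win rate, assuming the Hugging Face `datasets` library is installed. Only the `prompt` column is confirmed by the description, so the loader prints whatever split and column names it finds rather than assuming them. The `win_rate` helper and its "model" / "human" verdict labels are hypothetical, introduced here purely for illustration; they are not the evaluation procedure used by the dataset's authors.

```python
from typing import Iterable


def show_columns(repo_id: str = "RLHF-And-Friends/Human-vs-Shapa-8x") -> None:
    """Print every split of the dataset together with its column names."""
    # Imported lazily so the helper below works without the library installed.
    from datasets import load_dataset

    dataset_dict = load_dataset(repo_id)  # downloads from the Hugging Face Hub
    for split_name, split in dataset_dict.items():
        print(split_name, split.column_names)


def win_rate(verdicts: Iterable[str], winner: str = "model") -> float:
    """Fraction of verdicts equal to `winner`.

    The labels ("model", "human") are hypothetical; substitute whatever
    labels your own judge produces.
    """
    verdicts = list(verdicts)
    if not verdicts:
        raise ValueError("verdicts must not be empty")
    return sum(v == winner for v in verdicts) / len(verdicts)


if __name__ == "__main__":
    # Illustrative verdicts only, not real evaluation results.
    print(win_rate(["model", "model", "human", "model"]))
```

Calling `show_columns()` requires network access; `win_rate` is pure Python and can be applied to any list of judge verdicts.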

Provided by:

RLHF-And-Friends
