saepark/hh-rlhf-single-turn-RM-train-furthersplit-policy-train10k

Name: saepark/hh-rlhf-single-turn-RM-train-furthersplit-policy-train10k
Creator: saepark
Published: 2025-10-29 00:22:47
License: 暂无描述

Hugging Face2025-10-29 更新2025-11-15 收录

下载链接：

https://hf-mirror.com/datasets/saepark/hh-rlhf-single-turn-RM-train-furthersplit-policy-train10k

下载链接

链接失效反馈

官方服务：

资源简介：

这是一个包含对话数据的训练集，每个样本包括一个提示(prompt)、提示ID(prompt_id)、选中的回答(chosen)、被拒绝的回答(rejected)、对话消息(messages)、选中回答的评分(score_chosen)、被拒绝回答的评分(score_rejected)和其他信息(other_info，如数据来源)。训练集包含10000个示例。

This is a training dataset containing conversational data, with each sample including a prompt, prompt ID, chosen response, rejected responses, conversation messages, score of the chosen response, score of the rejected responses, and other information such as the data source. The training set contains 10,000 examples.

提供机构：

saepark

5,000+

优质数据集

54 个

任务类型

进入经典数据集