saepark/hh-rlhf-single-turn-RM-train-furthersplit-policy-train10k
收藏Hugging Face2025-10-29 更新2025-11-15 收录
下载链接:
https://hf-mirror.com/datasets/saepark/hh-rlhf-single-turn-RM-train-furthersplit-policy-train10k
下载链接
链接失效反馈官方服务:
资源简介:
这是一个包含对话数据的训练集,每个样本包括一个提示(prompt)、提示ID(prompt_id)、选中的回答(chosen)、被拒绝的回答(rejected)、对话消息(messages)、选中回答的评分(score_chosen)、被拒绝回答的评分(score_rejected)和其他信息(other_info,如数据来源)。训练集包含10000个示例。
This is a training dataset containing conversational data, with each sample including a prompt, prompt ID, chosen response, rejected responses, conversation messages, score of the chosen response, score of the rejected responses, and other information such as the data source. The training set contains 10,000 examples.
提供机构:
saepark



