teamcore/DPO_Pm3B_U0_beta0.25dr_dpoEurus_RM_7bbt_noise_flip0.3g
Source: Hugging Face. Updated: 2025-10-23. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/teamcore/DPO_Pm3B_U0_beta0.25dr_dpoEurus_RM_7bbt_noise_flip0.3g
Description:
This dataset is for testing response rebuttals. Each record contains the source text (source), the instruction (instruction), the models (models), and their completions (completions). Each completion carries ratings and rationales for helpfulness, honesty, instruction following, and truthfulness. The dataset also provides scores, feedback, and the chosen and rejected answers.
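The record layout described above can be sketched as a small field check that also extracts a preference pair. This is only an illustration: the key spellings `chosen` and `rejected`, the helper names `missing_fields` and `preference_pair`, and the example record are all assumptions, not taken from the dataset itself.

```python
# Top-level fields named in the description; exact key spellings are assumed.
EXPECTED_FIELDS = ("source", "instruction", "models", "completions", "chosen", "rejected")


def missing_fields(record):
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def preference_pair(record):
    """Return (instruction, chosen, rejected), the triple a DPO-style trainer consumes."""
    absent = missing_fields(record)
    if absent:
        raise KeyError(f"record is missing fields: {absent}")
    return record["instruction"], record["chosen"], record["rejected"]


# Hypothetical record, invented for illustration only (not real dataset content).
example = {
    "source": "demo",
    "instruction": "Name a prime number.",
    "models": ["model-a", "model-b"],
    "completions": [],  # per-model ratings and rationales would go here
    "chosen": "7",
    "rejected": "8",
}
```

Checking the fields before use makes a schema mismatch fail early with a readable message, which helps if the real keys differ from the ones assumed here.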
Provider:
teamcore



