teamcore/DPO_Pm3B_U0_beta0.25dpo_proEurus_RM_7b_nu0.008
Source: Hugging Face. Updated: 2025-10-21. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/teamcore/DPO_Pm3B_U0_beta0.25dpo_proEurus_RM_7b_nu0.008
Resource overview:
The dataset contains source text, instructions, a list of models, and completions. Each completion carries ratings and rationales for helpfulness, honesty, instruction following, and truthfulness. The dataset also includes critiques, custom system prompts, fine-grained scores, model names, overall scores, principles, responses, correct answers, incorrect answers, prompt text, chosen and rejected text, and scores and probabilities related to the Eurus_RM_7b model. It has a single default split with 3187 examples.
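Since the overview mentions prompt, chosen, and rejected text, a reader may want to pull those out as preference pairs. The following is a minimal sketch, not the provider's own tooling: the column names `prompt`, `chosen`, and `rejected` are assumptions (the listing does not give exact keys), and the helper names `to_preference_pairs` and `load_rows` are hypothetical. The download step uses `load_dataset` from the Hugging Face `datasets` library and needs network access, so the demonstration at the end runs on made-up rows instead.

```python
from typing import Iterable

# Assumed column names; the listing mentions prompt, chosen and rejected text
# but does not state the exact keys, so check them against the real data.
PROMPT_KEY = "prompt"
CHOSEN_KEY = "chosen"
REJECTED_KEY = "rejected"


def to_preference_pairs(rows: Iterable[dict]) -> list:
    """Collect (prompt, chosen, rejected) triples, skipping rows that lack any of the three keys."""
    pairs = []
    for row in rows:
        if all(key in row for key in (PROMPT_KEY, CHOSEN_KEY, REJECTED_KEY)):
            pairs.append((row[PROMPT_KEY], row[CHOSEN_KEY], row[REJECTED_KEY]))
    return pairs


def load_rows(repo_id: str = "teamcore/DPO_Pm3B_U0_beta0.25dpo_proEurus_RM_7b_nu0.008"):
    """Download the dataset from the Hugging Face Hub (requires network access)."""
    # Imported lazily so that to_preference_pairs works without the library installed.
    from datasets import load_dataset

    # With no split argument this returns a DatasetDict keyed by split name;
    # inspect it to find the actual split and column names.
    return load_dataset(repo_id)


# Offline demonstration with made-up rows (not taken from the dataset).
sample_rows = [
    {"prompt": "What is 2 + 2?", "chosen": "4", "rejected": "5"},
    {"prompt": "Row without a preference pair"},
]
pairs = to_preference_pairs(sample_rows)
```

After calling `load_rows()`, printing the returned object shows the available splits and columns, which is the quickest way to confirm or correct the assumed keys before passing rows to `to_preference_pairs`.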
Provider:
teamcore



