stochastic-parrots/MNLP_M1_Preference_dpo_dataset
Source: Hugging Face | Updated: 2025-05-24 | Indexed: 2025-11-01
Download link:
https://hf-mirror.com/datasets/stochastic-parrots/MNLP_M1_Preference_dpo_dataset
Description:
This dataset contains processed M1 preference data formatted for Direct Preference Optimization (DPO) training. It was created by the CS-552 Stochastic Parrots Team for the CS-552 course at EPFL and contains 17,615 examples. The data comes from interactions with large language models such as ChatGPT. Each example includes the question body, question options, prompt, chosen response, rejected response, course ID, question ID, question type, and ranking criteria.
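As a rough illustration of the DPO format described above, the sketch below reduces one record to the (prompt, chosen, rejected) triple that DPO training typically consumes. The key names `prompt`, `chosen`, and `rejected`, the helper `to_dpo_triple`, and the sample record are all assumptions made for illustration; the listing names the fields but does not give their exact keys or any sample data.

```python
from typing import Dict, Mapping

# Assumed key names: the listing describes these fields but not their exact keys.
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def to_dpo_triple(record: Mapping[str, object]) -> Dict[str, str]:
    """Reduce one preference record to the three fields a DPO trainer needs."""
    missing = [key for key in REQUIRED_KEYS if key not in record]
    if missing:
        raise KeyError(f"record is missing DPO fields: {missing}")

    # Normalize to stripped strings and drop all other metadata fields.
    triple = {key: str(record[key]).strip() for key in REQUIRED_KEYS}

    # A pair carries no preference signal if both responses are identical.
    if triple["chosen"] == triple["rejected"]:
        raise ValueError("chosen and rejected responses are identical")
    return triple


# Invented sample record, not taken from the dataset.
example_record = {
    "prompt": "What is 2 + 2? ",
    "chosen": "4",
    "rejected": "5",
    "question_id": 1,  # hypothetical metadata key, ignored by the helper
}
example_triple = to_dpo_triple(example_record)
```

If the dataset is reachable, records could presumably be obtained with the Hugging Face `datasets` library via `load_dataset("stochastic-parrots/MNLP_M1_Preference_dpo_dataset")` and passed through a helper like this one after checking the real column names.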
Provider:
stochastic-parrots



