CarperAI Human Preference Dataset
Source: arXiv (indexed 2025-09-30)
Download link:
https://huggingface.co/datasets/CarperAI/openai_summarize_comparisons
Description:
This dataset, provided by CarperAI, contains human preference data for various tasks. It is used for reinforcement learning from human feedback (RLHF) in the COFS-DPO method, and its task domain is cross-domain learning.
Provider:
CarperAI
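
Example usage:
The sketch below shows one way the preference records could be turned into (prompt, preferred, dispreferred) triples for RLHF or DPO-style training. It assumes each record exposes `prompt`, `chosen` and `rejected` fields and that a `train` split exists; neither is stated in this entry, so check the dataset page before relying on them. The helper names `to_preference_pairs` and `load_train_split` are hypothetical, and the sample records are made up so the sketch runs offline.

```python
from typing import Dict, Iterable, List, Tuple


def to_preference_pairs(
    records: Iterable[Dict[str, str]],
) -> List[Tuple[str, str, str]]:
    """Convert records into (prompt, preferred, dispreferred) triples.

    Assumes each record has 'prompt', 'chosen' and 'rejected' keys.
    Records whose two responses are identical are skipped, because they
    carry no preference signal.
    """
    pairs = []
    for rec in records:
        if rec["chosen"] == rec["rejected"]:
            continue
        pairs.append((rec["prompt"], rec["chosen"], rec["rejected"]))
    return pairs


def load_train_split():
    """Download the assumed 'train' split from the Hugging Face Hub.

    Needs network access and the `datasets` package (pip install datasets).
    The import is local so the rest of the sketch runs without it.
    """
    from datasets import load_dataset

    return load_dataset("CarperAI/openai_summarize_comparisons", split="train")


# Made-up records standing in for real rows (assumed schema).
sample = [
    {"prompt": "POST: first post", "chosen": "TL;DR: a", "rejected": "TL;DR: b"},
    {"prompt": "POST: second post", "chosen": "TL;DR: c", "rejected": "TL;DR: c"},
]
pairs = to_preference_pairs(sample)
```

With network access, `to_preference_pairs(load_train_split())` would apply the same conversion to the downloaded rows, provided the assumed field names match.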



