RLHFlow/self_rewarding_turn1_with_rewards_example
收藏Hugging Face2025-03-02 更新2025-04-12 收录
下载链接:
https://hf-mirror.com/datasets/RLHFlow/self_rewarding_turn1_with_rewards_example
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含了问答对以及相关提示信息,其中answers字段存储答案,gt字段存储标准答案,rewards字段表示奖励(可能为正确与否的标记),problem字段存储问题,prompt_messages字段存储与问题相关的提示信息,包括内容和角色。数据集分为训练集,共有40000个示例,文件大小为775,105,805字节。
The dataset includes question-answer pairs and related prompt information, where the answers field stores answers, the gt field stores the standard answers, the rewards field indicates rewards (possibly correctness markers), the problem field stores questions, and the prompt_messages field stores prompt information related to the questions, including content and role. The dataset is split into a training set, with a total of 40,000 examples, and a file size of 775,105,805 bytes.
提供机构:
RLHFlow



