mytestdpo/llama3_8b_it_gsm8k1_gold_tmp07_with_orm_rewards
Source: Hugging Face. Updated: 2025-01-06. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/mytestdpo/llama3_8b_it_gsm8k1_gold_tmp07_with_orm_rewards
Description:
The dataset has six fields: index (idx), prompt (prompt), answers (answers), ground-truth answer (gt), proxy label (proxy_label), and rewards (rewards). It has a single training split of 5,000 examples; the dataset size is 12,503,981 bytes and the download size is 4,664,800 bytes.
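As a rough illustration of the schema described above, the following Python sketch checks a record against the six listed field names and shows one way the training split might be loaded. It assumes the Hugging Face `datasets` library is installed, that the repository id matches the path in the download link, and that the repository is still publicly reachable. The helper names `missing_fields` and `load_train_split` are hypothetical, and the listing does not state the field types, so none are checked here.

```python
# Field names as given in the listing; their types are not stated there.
EXPECTED_FIELDS = {"idx", "prompt", "answers", "gt", "proxy_label", "rewards"}


def missing_fields(record: dict) -> set:
    """Return the expected field names that are absent from a record."""
    return EXPECTED_FIELDS - set(record)


def load_train_split():
    """Download the train split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so missing_fields() works without the library installed.
    from datasets import load_dataset

    # Assumption: the repository id is the path taken from the download link.
    return load_dataset(
        "mytestdpo/llama3_8b_it_gsm8k1_gold_tmp07_with_orm_rewards",
        split="train",
    )


if __name__ == "__main__":
    ds = load_train_split()
    print(len(ds))                # the listing reports 5000 examples
    print(missing_fields(ds[0]))  # empty if the schema matches the listing
```

If the mirror or the original repository is unavailable, `load_train_split` will raise an error; `missing_fields` can still be used on records obtained another way.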
Provider:
mytestdpo



