cchoi1/humaneval_qwen7b_att_iter0_ppo_att50_sol50_relabeled_dpo_1000
Source: Hugging Face. Updated: 2025-04-02. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/cchoi1/humaneval_qwen7b_att_iter0_ppo_att50_sol50_relabeled_dpo_1000
Overview:
The dataset has four fields: prompt, chosen, rejected, and task_id. It is split into a training set of 1,000 samples and a test set of 200 samples.
Provided by:
cchoi1
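
The four-field layout described in the overview can be sketched as a small schema check. This is a minimal illustration rather than the dataset's official loader: the helper names `missing_fields` and `to_preference_pair` are hypothetical, the example record holds placeholder values and not real dataset content, and the field value types are assumed since the listing does not state them.

```python
from typing import Dict, List

# Field names taken from the dataset description.
EXPECTED_FIELDS = ("prompt", "chosen", "rejected", "task_id")

# Split sizes stated in the dataset description.
EXPECTED_SPLIT_SIZES = {"train": 1000, "test": 200}


def missing_fields(record: Dict[str, object]) -> List[str]:
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def to_preference_pair(record: Dict[str, object]) -> Dict[str, object]:
    """Keep only the prompt/chosen/rejected triple, dropping task_id."""
    absent = missing_fields(record)
    if absent:
        raise KeyError(f"record is missing fields: {absent}")
    return {name: record[name] for name in ("prompt", "chosen", "rejected")}


# Hypothetical record: placeholder values, not actual dataset content.
example = {
    "prompt": "def add(a, b):\n    ...",
    "chosen": "    return a + b",
    "rejected": "    return a - b",
    "task_id": "example/0",
}

if __name__ == "__main__":
    print(missing_fields(example))
    print(sorted(to_preference_pair(example)))
```

A check like this could be run over each row after downloading the data, for instance to confirm that every record carries all four fields before the chosen/rejected pairs are handed to a preference-tuning step.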



