selfcorrexp2/selfcorrexp2_llama3_openmath_1m_ep1_tmp10_goldrm_labeled
Source: Hugging Face. Updated 2025-01-23; indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/selfcorrexp2/selfcorrexp2_llama3_openmath_1m_ep1_tmp10_goldrm_labeled
Description:
The dataset contains the following fields: index (idx), prompt (prompt), rewards (rewards), answers (answers), ground-truth label (gt), proxy label (proxy_label), and second rewards (second_rewards). It has a single training split with 15,000 examples, totaling 38,962,838 bytes; the download size is 15,300,484 bytes. The README does not describe the dataset's contents or intended use.
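The field list and sizes above can be turned into a quick sanity check. The following is a minimal sketch, not an official loader: the helper names (missing_fields, size_summary, load_train_split) are hypothetical, the repository id is taken from the download link, and loading it from the default Hugging Face hub with the `datasets` package is an assumption. Field value types are not documented, so only field presence is checked.

```python
# Field names as listed in the description; their value types are not documented.
EXPECTED_FIELDS = (
    "idx", "prompt", "rewards", "answers", "gt", "proxy_label", "second_rewards",
)

# Repository id taken from the download link (assumed to resolve on the default hub).
REPO_ID = "selfcorrexp2/selfcorrexp2_llama3_openmath_1m_ep1_tmp10_goldrm_labeled"

# Sizes quoted in the description.
NUM_TRAIN_EXAMPLES = 15_000
DATASET_BYTES = 38_962_838
DOWNLOAD_BYTES = 15_300_484


def missing_fields(record):
    """Return the expected field names that are absent from one record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def size_summary():
    """Derive the average record size and the download-to-dataset size ratio."""
    return {
        "avg_bytes_per_example": DATASET_BYTES / NUM_TRAIN_EXAMPLES,
        "download_ratio": DOWNLOAD_BYTES / DATASET_BYTES,
    }


def load_train_split():
    """Load the train split; needs the `datasets` package and network access."""
    # Imported lazily so the helpers above work without the package installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split="train")
```

Calling missing_fields on the first record returned by load_train_split() should give an empty list if the listing is accurate. size_summary() works out to roughly 2.6 KB per example, with a download about 39% of the dataset size.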
Provider:
selfcorrexp2



