selfcorrexp2/llama3_openmath_em_ep1_tmp10_gold_reward
收藏Hugging Face2025-01-07 更新2025-04-26 收录
下载链接:
https://hf-mirror.com/datasets/selfcorrexp2/llama3_openmath_em_ep1_tmp10_gold_reward
下载链接
链接失效反馈官方服务:
资源简介:
这是一个包含多个字段的数据集,其中包括索引(idx)、真实标签(gt)、提示信息(prompt)、难度级别(level)、类型(type)、解决方案(solution)、用户解决方案(my_solu)、预测结果(pred)、奖励标志(rewards)、消息列表(messages,包含内容和角色)、自定义提示(my_prompt)以及代理奖励(proxy_reward)。数据集分为训练集,共有5000个示例,大小为43991326字节。
This dataset includes multiple fields such as index (idx), ground truth labels (gt), prompt information (prompt), difficulty level (level), type (type), solution (solution), users solution (my_solu), prediction results (pred), reward flags (rewards), a list of messages (including content and role), custom prompt (my_prompt), and proxy reward (proxy_reward). The dataset is split into a training set with a total of 5000 examples, totaling 43991326 bytes in size.
提供机构:
selfcorrexp2



