selfcorrexp2/orm-less-corr-label_llama3_sft_tmp10
Source: Hugging Face. Updated 2025-01-09. Indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/selfcorrexp2/orm-less-corr-label_llama3_sft_tmp10
Resource description:
The dataset contains the following fields: index (idx), ground truth (gt), prompt (prompt), difficulty level (level), type (type), solution (solution), user's solution (my_solu), prediction (pred), reward flag (rewards), and probability-predicted reward (pro_predict_rewards). It has a single training split (train) with 5,000 examples, 20,543,991 bytes in size. The README does not describe the intended use case or give any further detail.
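As a rough sketch of how the listed schema could be checked after download, the Python snippet below loads the train split with the Hugging Face datasets library and compares its columns against the field names given above. The helper names (missing_columns, load_train_split) are hypothetical and only for illustration; the field types are not documented in the README, so the sketch checks names and the row count only. It assumes the datasets package is installed and the repository is publicly reachable.

```python
# Field names and example count are taken from the description above.
EXPECTED_COLUMNS = [
    "idx", "gt", "prompt", "level", "type", "solution",
    "my_solu", "pred", "rewards", "pro_predict_rewards",
]
EXPECTED_TRAIN_ROWS = 5000

DATASET_ID = "selfcorrexp2/orm-less-corr-label_llama3_sft_tmp10"


def missing_columns(column_names):
    """Return the expected columns that are absent from column_names."""
    present = set(column_names)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def load_train_split():
    """Download the train split (hypothetical helper; needs network access)."""
    # Imported lazily so missing_columns can be used without the package.
    from datasets import load_dataset
    return load_dataset(DATASET_ID, split="train")
```

To use it, call train = load_train_split(), then compare train.num_rows with EXPECTED_TRAIN_ROWS and inspect missing_columns(train.column_names); an empty list means every listed field is present.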
Provider:
selfcorrexp2



