mytestdpo/llama3_it_8b_gsm8k_kumar_baselinetmp10
Source: Hugging Face. Updated: 2024-12-30. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/mytestdpo/llama3_it_8b_gsm8k_kumar_baselinetmp10
Description:
The dataset contains the fields index (idx), ground truth (gt), prompt text (prompt), answer (answer), user solution (my_solu), prediction (pred), and reward (rewards). It has a single training split of 3,957 examples. The README does not state the dataset's intended application or purpose.
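As a rough illustration of the schema described above, the following sketch checks that a record carries the seven listed fields. The field names come from the description; the value types are not documented, so the check looks only at key names. The helper names `missing_fields` and `load_train_split` are hypothetical, and `load_train_split` assumes the dataset is publicly readable through the Hugging Face `datasets` library under the repository name given in the download link.

```python
# Field names as listed in the dataset description. Value types are not
# documented there, so nothing below inspects the values themselves.
EXPECTED_FIELDS = ("idx", "gt", "prompt", "answer", "my_solu", "pred", "rewards")


def missing_fields(record):
    """Return the expected field names that are absent from one record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def load_train_split():
    """Load the training split (assumes network access and a public dataset)."""
    # Imported lazily so the schema check above works without the library.
    from datasets import load_dataset

    return load_dataset(
        "mytestdpo/llama3_it_8b_gsm8k_kumar_baselinetmp10", split="train"
    )


# Hypothetical record with placeholder values, for illustration only.
example = {name: None for name in EXPECTED_FIELDS}
```

Calling `missing_fields` on each row of the loaded split would flag any record that deviates from the described schema.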
Provider:
mytestdpo



