mytestdpo/llama3_sft_gsm8k_sft_model_gen2_auggsm8k_
Source: Hugging Face. Updated: 2025-01-19. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/mytestdpo/llama3_sft_gsm8k_sft_model_gen2_auggsm8k_
Description:
The dataset contains prompts (prompt), their corresponding answer sequences (answers), and ground-truth labels (gt). It also includes reward information (first_rewards and second_rewards) and predictions (prediction). The data is provided as a training split (train) of 367,151,532 bytes containing 17,472 examples. The README does not describe the dataset's intended application or its content in further detail.
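The description above names six fields and one split, which is enough for a minimal loading sketch. The block below assumes the Hugging Face `datasets` library is installed and that the repository ID matches the path in the download link. `missing_fields` and `load_train_split` are hypothetical helper names, and the field types are not documented, so the check only looks at field names.

```python
# Field names as listed in the dataset description; their types are not documented.
EXPECTED_FIELDS = (
    "prompt",
    "answers",
    "gt",
    "first_rewards",
    "second_rewards",
    "prediction",
)

# Assumption: the Hub repository ID equals the path in the download link.
REPO_ID = "mytestdpo/llama3_sft_gsm8k_sft_model_gen2_auggsm8k_"

# Average size per example, derived from the two figures in the description.
AVG_BYTES_PER_EXAMPLE = 367_151_532 / 17_472  # roughly 21 KB


def missing_fields(record):
    """Return the expected field names that are absent from one record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def load_train_split():
    """Download the train split (needs `pip install datasets` and network access)."""
    from datasets import load_dataset  # imported lazily so the helpers above work offline

    return load_dataset(REPO_ID, split="train")
```

Calling `missing_fields(load_train_split()[0])` would report any field from the description that the first record lacks. An empty list means the names match the listing.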
Provider:
mytestdpo



