mytestdpo/type12_7ktype3_6ktype4_llama3it_gsm8k
Source: Hugging Face. Updated: 2025-01-16. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/mytestdpo/type12_7ktype3_6ktype4_llama3it_gsm8k
Resource summary:
Each record in the dataset holds a prompt (prompt), a pair of texts, and a set of labels. The text pair consists of the chosen text (chosen_txt) and the rejected text (rejected_txt). The labels are gt (possibly the ground truth), chosen (the label for the chosen text), and rejected (the label for the rejected text). A margin field (margin) of type float64 is also included. The dataset provides a training split (train) containing 37,479 examples.
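The fields described above can be checked with a short script. The following is a minimal sketch, not an official loader: the helper names (`schema_problems`, `to_preference_pair`, `load_train_split`) are hypothetical, the example record is invented for illustration, and the string type used for the gt, chosen, and rejected labels is an assumption, since the summary only states a type for margin. Loading the real data assumes the Hugging Face `datasets` package is installed and the repository is reachable.

```python
from typing import Any, Dict, List

# Field names as listed in the resource summary.
EXPECTED_FIELDS = (
    "prompt", "chosen_txt", "rejected_txt",
    "gt", "chosen", "rejected", "margin",
)

DATASET_ID = "mytestdpo/type12_7ktype3_6ktype4_llama3it_gsm8k"


def load_train_split():
    """Fetch the train split (assumes `datasets` is installed and network access)."""
    from datasets import load_dataset  # lazy import so the helpers below work offline
    return load_dataset(DATASET_ID, split="train")


def schema_problems(record: Dict[str, Any]) -> List[str]:
    """Return a list of mismatches between a record and the described fields."""
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in record]
    # The summary gives the margin type as float64, which maps to a Python float.
    if "margin" in record and not isinstance(record["margin"], float):
        problems.append("margin is not a float")
    return problems


def to_preference_pair(record: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the prompt and the two texts, the usual shape for preference training."""
    return {
        "prompt": record["prompt"],
        "chosen": record["chosen_txt"],
        "rejected": record["rejected_txt"],
    }


# Hypothetical record, invented for illustration; label types are assumed to be strings.
example = {
    "prompt": "What is 2 + 3?",
    "chosen_txt": "2 + 3 = 5. The answer is 5.",
    "rejected_txt": "2 + 3 = 6. The answer is 6.",
    "gt": "5",
    "chosen": "5",
    "rejected": "6",
    "margin": 1.0,
}

if __name__ == "__main__":
    print(schema_problems(example))
    print(to_preference_pair(example))
```

Running `schema_problems` on the first few rows returned by `load_train_split()` is a quick way to confirm that the published data matches this description before training on it.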
Provider:
mytestdpo



