selfcorrexp/llama3_it_firstwrong_try2_merged

Name: selfcorrexp/llama3_it_firstwrong_try2_merged
Creator: selfcorrexp
Published: 2025-01-26 11:16:57
License: not specified

Source: Hugging Face. Updated: 2025-01-26. Indexed: 2025-02-15.

Download link:

https://hf-mirror.com/datasets/selfcorrexp/llama3_it_firstwrong_try2_merged

Description:

The dataset has four fields: answers, prompt, gt (ground truth), and second_rewards. The answers field is a sequence of strings and second_rewards is a sequence of booleans, while prompt and gt are single strings. The dataset contains only a training split of 110,000 examples, totaling 684,923,470 bytes; the download size is 244,969,859 bytes.
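The field layout described above can be sketched as a small record check. This is only an illustration: it assumes answers is a list of strings and second_rewards a list of booleans, the helper names (`is_valid_record`, `load_train_split`) and the sample record are hypothetical, and loading relies on the Hugging Face `datasets` package being installed with network access available.

```python
from typing import Any

# Field names as listed in the description.
EXPECTED_FIELDS = ("answers", "prompt", "gt", "second_rewards")


def is_valid_record(record: dict[str, Any]) -> bool:
    """Check one record against the assumed schema (hypothetical helper)."""
    if set(record) != set(EXPECTED_FIELDS):
        return False
    answers = record["answers"]
    rewards = record["second_rewards"]
    return (
        isinstance(record["prompt"], str)
        and isinstance(record["gt"], str)
        and isinstance(answers, list)
        and all(isinstance(a, str) for a in answers)
        and isinstance(rewards, list)
        and all(isinstance(r, bool) for r in rewards)
    )


def load_train_split():
    """Load the only split (train); needs `datasets` and network access."""
    from datasets import load_dataset  # imported lazily so the check above runs without it

    return load_dataset("selfcorrexp/llama3_it_firstwrong_try2_merged", split="train")


# A made-up record, for illustration only; not taken from the dataset.
example = {
    "prompt": "What is 2 + 3?",
    "answers": ["The answer is 6.", "The answer is 5."],
    "gt": "5",
    "second_rewards": [False, True],
}
```

Calling `is_valid_record(example)` on the made-up record checks only field names and types; whether answers and second_rewards always have the same length is not stated in the description, so the sketch does not enforce it.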

Provider:

selfcorrexp
