selfcorrexp2/llama31_first_wrong_math_merged
收藏Hugging Face2024-12-22 更新2025-02-15 收录
下载链接:
https://hf-mirror.com/datasets/selfcorrexp2/llama31_first_wrong_math_merged
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含了一系列的对话,每个对话包含多个轮次。每个轮次包括了提示信息、多个答案选项、是否为第一轮的标记、正确答案、是否获得奖励的标记、参与者提供的解决方案、一个标志位、对话轮次信息,以及对话中每个参与者的内容和角色。数据集提供了一个训练集,可用于模型训练。
The dataset consists of a series of conversations, each with multiple turns. Each turn includes prompt information, multiple answer options, a flag indicating if its the first round, the correct answer, a reward marker, a participants proposed solution, a flag, the turn information, and the content and role of each participant in the conversation. The dataset provides a training set for model training.
提供机构:
selfcorrexp2



