selfcorrexp/llama3_regular_balanced_sft_4_ORM_training

Name: selfcorrexp/llama3_regular_balanced_sft_4_ORM_training
Creator: selfcorrexp
Published: 2024-12-29 00:10:45
License: 暂无描述

Hugging Face2024-12-29 更新2025-02-15 收录

下载链接：

https://hf-mirror.com/datasets/selfcorrexp/llama3_regular_balanced_sft_4_ORM_training

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集包含对话信息，每个样本包括索引、提示、是否为第一轮对话、真实标签、奖励序列标志、标志位、对话轮次和具体的对话内容（包括对话内容和角色）。训练集包含173928个样本。

The dataset contains conversational information, with each sample including an index, prompt, whether its the first round of conversation, ground truth label, reward sequence flag, a flag, the turn of the conversation, and specific conversation content (including the text of the conversation and the role). The training set contains 173928 samples.

提供机构：

selfcorrexp

5,000+

优质数据集

54 个

任务类型

进入经典数据集