THUDM/LongReward-10k

Name: THUDM/LongReward-10k
Creator: THUDM
Published: 2024-10-29 02:29:37
License: No description available

Source: Hugging Face. Updated: 2024-10-29. Indexed: 2025-04-08.

Download link:

https://hf-mirror.com/datasets/THUDM/LongReward-10k

Resource description:

The LongReward-10k dataset contains 10,000 long-context QA instances in English and Chinese, each up to 64,000 words long. It is divided into three parts: the sft part contains supervised fine-tuning (SFT) data, while the dpo_glm4_9b and dpo_llama3.1_8b parts are long-context preference datasets used to train different DPO models.
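The three parts named above can be loaded one at a time. The following is a minimal sketch, assuming the parts are exposed as splits of the Hugging Face repository and that the `datasets` library is installed; the helper `load_part` and its `loader` parameter are hypothetical names introduced here for illustration, not part of any library.

```python
REPO_ID = "THUDM/LongReward-10k"

# Part names as listed in the description; treating them as split names
# is an assumption of this sketch.
PARTS = ("sft", "dpo_glm4_9b", "dpo_llama3.1_8b")


def load_part(part, loader=None):
    """Load one part of LongReward-10k after validating its name.

    `loader` is a hypothetical hook: any callable taking (repo_id, split=...).
    When omitted, `datasets.load_dataset` is imported lazily and used.
    """
    if part not in PARTS:
        raise ValueError(f"unknown part {part!r}; expected one of {PARTS}")
    if loader is None:
        from datasets import load_dataset  # requires `pip install datasets`
        loader = load_dataset
    return loader(REPO_ID, split=part)
```

Calling `load_part("sft")` would download the SFT part on first use, while `load_part("dpo_glm4_9b")` would fetch one of the preference sets. To go through the mirror host in the download link rather than the default endpoint, setting the `HF_ENDPOINT` environment variable to `https://hf-mirror.com` before importing `datasets` is the usual approach, though this is not confirmed by the listing itself.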

Provided by:

THUDM
