RyanYr/grpo-dapo-qwen3-4B-Base-mbs128-n4_mmlupro

Name: RyanYr/grpo-dapo-qwen3-4B-Base-mbs128-n4_mmlupro
Creator: RyanYr
Published: 2026-04-27 06:03:38
License: Not specified

Source: Hugging Face. Updated: 2026-04-27. Indexed: 2026-05-03.

Download link:

https://hf-mirror.com/datasets/RyanYr/grpo-dapo-qwen3-4B-Base-mbs128-n4_mmlupro


Resource description:

This dataset is intended for reward model evaluation. Each example has structured features: a prompt (composed of role and content fields), a data source, reward model information (including a ground truth and a style), and multiple responses. The data is divided into 10 test subsets, named test.10 through test.100, each containing 12,032 examples, for testing and evaluating model performance under different conditions.
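The split layout described above can be sketched in Python as follows. This is a minimal sketch, not the creator's own tooling. It assumes the ten splits are named in steps of 10 (test.10, test.20, ..., test.100), since the description gives only the endpoints and the count, and it assumes the repository loads with the Hugging Face `datasets` library. The helper names `total_examples` and `load_split` are hypothetical.

```python
# Assumption: splits are named test.10, test.20, ..., test.100.
# The description states only the first name, the last name, and the count (10).
SPLIT_NAMES = [f"test.{step}" for step in range(10, 101, 10)]

# Stated in the description: every split has 12032 examples.
EXAMPLES_PER_SPLIT = 12032

REPO_ID = "RyanYr/grpo-dapo-qwen3-4B-Base-mbs128-n4_mmlupro"


def total_examples() -> int:
    """Total example count across all splits, derived from the description."""
    return len(SPLIT_NAMES) * EXAMPLES_PER_SPLIT


def load_split(split: str):
    """Load one split (hypothetical helper; needs network access and `datasets`)."""
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


if __name__ == "__main__":
    print(SPLIT_NAMES)
    print(total_examples())
```

Calling `load_split("test.10")` and inspecting `.features` on the result would show the actual column names, which the description gives only in prose.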

Provider:

RyanYr
