allenai/RLVR-IFeval

Name: allenai/RLVR-IFeval
Creator: allenai
Published: 2024-11-21 07:17:40
License: 暂无描述

Hugging Face2024-11-21 更新2024-12-14 收录

下载链接：

https://hf-mirror.com/datasets/allenai/RLVR-IFeval

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集包含用于强化学习与可验证奖励的指令跟随数据，格式适用于open-instruct。数据是通过从Tulu 2 SFT mixture中采样并随机添加IFEval的约束生成的。每个数据点包含标准指令调优数据点，如messages（用于提示模型的输入）、ground_truth（传递给验证函数的参数）、dataset（样本所属的数据集）、constraint_type（提示中的约束类型）和constraint（用普通英语描述的约束）。

This dataset contains instruction following data formatted for use with open-instruct, specifically for reinforcement learning with verifiable rewards. The prompts in the dataset are generated by sampling from the Tulu 2 SFT mixture and randomly adding constraints from IFEval. Each example includes a list of messages (inputs used to prompt the model), ground truth (arguments passed to the verifying function as a JSON blob), dataset source, constraint type, and constraint description.

提供机构：

allenai

搜集汇总

数据集介绍

背景与挑战

背景概述

该数据集是用于指令遵循任务的数据集，专门为支持可验证奖励的强化学习而设计。其特点在于通过从Tulu 2 SFT混合数据中采样提示并随机添加来自IFEval的约束，构建了结构化数据，每个样本包含消息、真实值等字段，以促进模型的指令微调和性能验证。

以上内容由遇见数据集搜集并总结生成

5,000+

优质数据集

54 个

任务类型

进入经典数据集