IF-Verifier-Data
Source: ModelScope community (updated 2025-10-28, indexed 2025-07-19)
Download link: https://modelscope.cn/datasets/THU-KEG/IF-Verifier-Data
# Dataset Card for IF-Verifier-Data
## Dataset Details
### Dataset Description
- **Curated by:** Hao Peng@THUKEG
- **Language(s) (NLP):** English, Chinese
- **License:** apache-2.0
### Dataset Sources
- **Repository:** https://github.com/THU-KEG/VerIF
- **Paper:** https://arxiv.org/abs/2506.09942
## Uses
This dataset is intended for training generative reward models for instruction following.
## Dataset Structure
The data is stored in JSON Lines (`jsonl`) format. Each line is a JSON object with the following structure:
```
{
    "id": <data id>,
    "messages": [
        {"role": "user", "content": <user query>},
        {"role": "assistant", "content": <response from QwQ 32B>}
    ]
}
```
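As a minimal sketch of how a record with this structure might be read, the snippet below parses one line into its id, user query, and assistant response. The helper names `parse_record` and `read_jsonl` and the sample line are hypothetical illustrations, not part of the dataset or its tooling, and the sketch assumes each record holds exactly one `user` and one `assistant` message, as the schema above suggests.

```python
import json


def parse_record(line):
    """Parse one JSONL line into (id, user query, assistant response)."""
    item = json.loads(line)
    # Index the messages by role; assumes one message per role.
    by_role = {m["role"]: m["content"] for m in item["messages"]}
    return item["id"], by_role["user"], by_role["assistant"]


def read_jsonl(lines):
    """Yield parsed records from an iterable of lines, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_record(line)


# Made-up example line for illustration; not taken from the dataset.
sample = json.dumps({
    "id": "example-0",
    "messages": [
        {"role": "user", "content": "placeholder query"},
        {"role": "assistant", "content": "placeholder verification"},
    ],
})

records = list(read_jsonl([sample, ""]))
print(records[0])
```

To read a downloaded file, pass an open file handle to `read_jsonl`, for example `with open(path, encoding="utf-8") as f:`, where `path` is wherever the file was saved locally.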
## Dataset Creation
### Source Data
The source data is drawn from WildChat (https://huggingface.co/datasets/allenai/WildChat) and InfinityInstruct (https://huggingface.co/datasets/BAAI/Infinity-Instruct).
#### Data Collection and Processing
We first generate an additional **20,000** data instances, following the procedure used for [VerInstruct](https://huggingface.co/datasets/Wesleythu/Crab-VerIF). To ensure diversity, we also mine complex instructions from WildChat and Infinity Instruct. Specifically, we use Qwen2.5-72B-Instruct to extract constraints from each instruction and classify them as hard or soft. For hard constraints, we use Qwen2.5-72B-Instruct to generate corresponding Python verification scripts. For each instruction, we randomly sample a response from 6 different models: Llama3.1-8B-Instruct, Llama-3.3-70B-Instruct, Qwen2.5-7B-Instruct, Qwen2.5-72B-Instruct, QwQ-32B, and DeepSeek-R1-Distilled-Qwen-32B. We then use QwQ-32B to generate, for each instruction-response pair, a step-by-step verification indicating whether the response satisfies the instruction. As a result, we collect about 130k instruction-response pairs with corresponding step-by-step verification.
For more details, please refer to our paper and our GitHub [repo](https://github.com/THU-KEG/VerIF).
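To illustrate what a Python verification script for hard constraints could look like, here is a small hypothetical sketch. The constraint types (a word limit and a required keyword), the function names, and the example response are all assumptions made for illustration. The actual scripts are generated by Qwen2.5-72B-Instruct and may look quite different.

```python
import re


def check_max_words(response, limit):
    """Return True if the response has at most `limit` whitespace-separated words."""
    return len(response.split()) <= limit


def check_contains_keyword(response, keyword):
    """Return True if `keyword` appears in the response, ignoring case."""
    return re.search(re.escape(keyword), response, flags=re.IGNORECASE) is not None


def verify_hard_constraints(response, checks):
    """Run each (name, function) check and return a name -> bool mapping."""
    return {name: fn(response) for name, fn in checks}


# Hypothetical hard constraints extracted from an instruction.
checks = [
    ("max_20_words", lambda r: check_max_words(r, 20)),
    ("mentions_python", lambda r: check_contains_keyword(r, "python")),
]

result = verify_hard_constraints("Python is a popular language.", checks)
print(result)
```

Rule-based checks like these only suit constraints that can be tested mechanically. Soft constraints such as tone or style are the ones the card describes as being judged through model-generated step-by-step verification.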
## Citation
```
@misc{peng2025verif,
title={VerIF: Verification Engineering for Reinforcement Learning in Instruction Following},
author={Hao Peng and Yunjia Qi and Xiaozhi Wang and Bin Xu and Lei Hou and Juanzi Li},
year={2025},
eprint={2506.09942},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.09942},
}
```
## Dataset Card Contact
Please contact peng-h24@mails.tsinghua.edu.cn if you have any questions.
Provider: maas
Created: 2025-07-15



