IF-Verifier-Data

Name: IF-Verifier-Data
Creator: maas
Published: 2025-10-28 18:02:19
License: no description provided

ModelScope community; updated 2025-10-28; indexed 2025-07-19

Download link:

https://modelscope.cn/datasets/THU-KEG/IF-Verifier-Data


Resource description:

# Dataset Card for IF-Verifier-Data

## Dataset Details

### Dataset Description

- **Curated by:** Hao Peng@THUKEG
- **Language(s) (NLP):** English, Chinese
- **License:** apache-2.0

### Dataset Sources

- **Repository:** https://github.com/THU-KEG/VerIF
- **Paper:** https://arxiv.org/abs/2506.09942

## Uses

This data is used for training generative reward models for instruction following.

## Dataset Structure

The data is in `jsonl` format. Each line is a JSON object with the following format:

```
{
  "id": <data id>,
  "messages": [
    {"role": "user", "content": <user query>},
    {"role": "assistant", "content": <response from QwQ 32B>}
  ]
}
```

## Dataset Creation

### Source Data

The original data comes from WildChat (https://huggingface.co/datasets/allenai/WildChat) and InfinityInstruct (https://huggingface.co/datasets/BAAI/Infinity-Instruct).

#### Data Collection and Processing

We first generate an additional 20,000 data instances, following the procedure used for [VerInstruct](https://huggingface.co/datasets/Wesleythu/Crab-VerIF). To ensure diversity, we also mine complex instructions from WildChat and InfinityInstruct. Specifically, we use Qwen2.5-72B-Instruct to extract the constraints from each instruction and classify each one as hard or soft. For hard constraints, we use Qwen2.5-72B-Instruct to generate the corresponding Python verification scripts.

For each instruction, we randomly sample one response from a pool of six models: Llama3.1-8B-Instruct, Llama-3.3-70B-Instruct, Qwen2.5-7B-Instruct, Qwen2.5-72B-Instruct, QwQ-32B, and DeepSeek-R1-Distilled-Qwen-32B. We then use QwQ-32B to generate, for each instruction-response pair, a step-by-step verification indicating whether the response satisfies the instruction. As a result, we collect about 130k instruction-response pairs with corresponding step-by-step verification.

For more details, please refer to our paper and our GitHub [repo](https://github.com/THU-KEG/VerIF).
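The record layout described under Dataset Structure can be read with the standard library alone. The snippet below is a minimal sketch, not the authors' loader: the helper names `iter_records` and `split_pair`, the string-valued `id`, and the strict two-message check are assumptions made for illustration, and the sample line is invented rather than taken from the dataset.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSONL lines into records, skipping blank lines.

    Assumes (per the card) that every record holds exactly one user
    message followed by one assistant message.
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        roles = [m["role"] for m in record["messages"]]
        if roles != ["user", "assistant"]:
            raise ValueError(f"line {line_no}: unexpected roles {roles}")
        yield record


def split_pair(record: dict) -> Tuple[str, str]:
    """Return (user query, assistant response) for one record."""
    user, assistant = record["messages"]
    return user["content"], assistant["content"]


# Invented sample line that mirrors the documented format.
sample = [
    '{"id": "demo-0", "messages": ['
    '{"role": "user", "content": "Check the response."}, '
    '{"role": "assistant", "content": "Step 1: ..."}]}',
    "",
]
records = list(iter_records(sample))
query, response = split_pair(records[0])
```

To read a downloaded file, pass an open file handle instead of the list, for example `iter_records(open(path, encoding="utf-8"))`, where `path` points at your local copy of the `jsonl` file.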
## Citation

```
@misc{peng2025verif,
  title={VerIF: Verification Engineering for Reinforcement Learning in Instruction Following},
  author={Hao Peng and Yunjia Qi and Xiaozhi Wang and Bin Xu and Lei Hou and Juanzi Li},
  year={2025},
  eprint={2506.09942},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2506.09942},
}
```

## Dataset Card Contact

Please contact peng-h24@mails.tsinghua.edu.cn if you have any questions.


Provider:

maas

Created:

2025-07-15
