WebInstruct-verified-unfiltered

Name: WebInstruct-verified-unfiltered
Creator: maas
Published: 2025-12-05 16:39:45
License: no description provided

ModelScope community · Updated 2025-12-05 · Indexed 2025-06-28

Download link:

https://modelscope.cn/datasets/TIGER-Lab/WebInstruct-verified-unfiltered


Resource overview:

**This repo contains the unfiltered version of [WebInstruct-verified](https://huggingface.co/datasets/TIGER-Lab/WebInstruct-verified) from the General-Reasoner work.**

## General-Reasoner: Advancing LLM Reasoning Across All Domains

<a href="https://github.com/TIGER-AI-Lab/General-Reasoner" target="_blank">💻 Code</a> | <a href="https://arxiv.org/abs/2505.14652" target="_blank">📄 Paper</a> | <a href="https://huggingface.co/datasets/TIGER-Lab/WebInstruct-verified" target="_blank">📊 Dataset</a> | <a href="https://huggingface.co/collections/TIGER-Lab/general-reasoner-67fe9386e43e046489eac013" target="_blank">🤗 Model</a> | <a href="https://tiger-ai-lab.github.io/General-Reasoner/" target="_blank">🌐 Project Page</a>

## Overview

<img src="https://tiger-ai-lab.github.io/General-Reasoner/static/images/teaser.png" alt="General-Reasoner Teaser" width="650"/>

Figure: Effectiveness of General-Reasoner, trained on diverse verifiable reasoning questions with a model-based verifier, compared to baseline methods on various reasoning tasks.

**General-Reasoner** is a training paradigm for large language models (LLMs), designed to robustly enhance reasoning abilities across diverse domains: not just mathematics and coding, but also physics, chemistry, finance, humanities, and more.

**Key features:**

- **Zero RL Training:** Direct reinforcement learning from base LLMs, bypassing intermediate supervised stages.
- **Diverse Reasoning Data:** 230K+ high-quality, verifiable questions sourced from the web and filtered for answer verifiability across disciplines.
- **Model-Based Verifier:** Compact 1.5B generative verifier model for context-aware, chain-of-thought answer validation, outperforming traditional rule-based methods.

## Dataset Details

We construct a diverse, high-quality dataset to facilitate robust reasoning capabilities across a broad range of domains, extending beyond the commonly studied mathematical problems.

- **We trace back the data in WebInstruct to its original web page to re-crawl the question-answer pairs.** If the original page lacks human-written answers, we drop the entry. This ensures every re-crawled item is human-verified and, therefore, that each answer is of reliable quality.
- **Gemini-1.5-Pro is employed to selectively extract questions with clearly verifiable short answers,** further boosting dataset reliability.
- **Gemini-2.0-Flash then generates eight candidate answers per question for additional filtering:**
  - We discard any question for which **all eight Gemini-generated answers are incorrect**, eliminating ambiguous or noisy items that arose during web scraping.
  - We also remove **overly simple questions**, meaning those for which **all eight candidate answers are correct**, to preserve dataset complexity and better challenge model generalization.

These steps ensure the correctness of the constructed dataset.

## Distribution

The distribution of disciplines is depicted as follows:

<img src="https://cdn-uploads.huggingface.co/production/uploads/6313a86154e6e5d9f0f94e04/I_TplgIibmBM_A_nwZh7B.png" width="600"/>

## Verification

The short answers take different forms, including float, array, matrix, LaTeX, etc. To verify these answers, please use GPT/Gemini or the locally served model at https://huggingface.co/TIGER-Lab/general-verifier.

## Citation

If you find our work helpful, please cite:

```bibtex
@article{general-reasoner,
  title={{G}eneral-{R}easoner: Advancing LLM Reasoning Across All Domains},
  author={Xueguang Ma and Qian Liu and Dongfu Jiang and Ge Zhang and Zejun Ma and Wenhu Chen},
  year={2025},
  journal={arXiv:2505.14652},
  url={https://arxiv.org/abs/2505.14652}
}
```
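The candidate-answer filter described under Dataset Details can be illustrated with a minimal sketch. It assumes each question already has a list of boolean correctness flags, one per candidate answer; how those flags are produced is not specified in this card. The names `keep_question` and `filter_questions` and the question IDs are hypothetical and are not taken from the General-Reasoner code.

```python
from typing import Dict, List


def keep_question(candidate_correct: List[bool]) -> bool:
    """Keep a question only if its candidates are neither all wrong nor all right.

    candidate_correct holds one flag per candidate answer (eight in the card);
    the grading that produces these flags is assumed to happen elsewhere.
    """
    if not candidate_correct:
        return False  # nothing was graded, so the question cannot be kept
    n_correct = sum(candidate_correct)
    # 0 correct: likely ambiguous or noisy; all correct: considered too easy.
    return 0 < n_correct < len(candidate_correct)


def filter_questions(graded: Dict[str, List[bool]]) -> List[str]:
    """Return the IDs of questions that pass the filter, in input order."""
    return [qid for qid, flags in graded.items() if keep_question(flags)]


if __name__ == "__main__":
    # Hypothetical grading results for three questions.
    graded = {
        "q_noisy": [False] * 8,  # every candidate wrong: dropped
        "q_easy": [True] * 8,    # every candidate right: dropped
        "q_kept": [True, False, True, True, False, False, True, False],
    }
    print(filter_questions(graded))  # prints ['q_kept']
```

Counting correct candidates rather than comparing the lists against fixed patterns keeps the rule independent of the number of candidates, so the same check would apply if a different sample size were used.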


Provider: maas

Created: 2025-06-26
