facebook/natural_reasoning

Source: ModelScope community · Updated: 2026-01-02 · Indexed: 2025-02-22
Download link:
https://modelscope.cn/datasets/AI-ModelScope/natural_reasoning
Resource description:
[NaturalReasoning](https://arxiv.org/abs/2502.13124) is a large-scale dataset for general reasoning tasks. It consists of high-quality challenging reasoning questions backtranslated from pretraining corpora [DCLM](https://github.com/mlfoundations/dclm) and [FineMath](https://huggingface.co/datasets/HuggingFaceTB/finemath). The questions have been deduplicated and decontaminated from popular reasoning benchmarks including MATH, GPQA, MMLU-Pro, MMLU-STEM. For each question, we extract the reference final answer from the original document from the pretraining corpora if possible. We also provide a model-generated response from Llama3.3-70B-Instruct.

We release a 1.1 million subset of NaturalReasoning to the research community to foster research on training strong LLM reasoners.

You can load the dataset as follows:

```python
from datasets import load_dataset

ds = load_dataset("facebook/natural_reasoning")
```

For more information regarding data collection, please refer to our [paper](https://arxiv.org/abs/2502.13124).

## Reference Answer Statistics

In the 1.1 million subset, 18.29% of the questions do not have a reference answer, 9.71% of the questions have a single word answer, 21.58% of the questions have a short answer while 50.42% of the questions have a long reference answer.

## Scaling Curve

Training on NaturalReasoning shows better scaling effects than training on other datasets when training Llama3.1-8B-Instruct model. In particular, we measure the average performance on three benchmarks: MATH, GPQA, MMLU-Pro.

<img src="https://cdn-uploads.huggingface.co/production/uploads/659a395421a7431643caedda/S6aO-agjRRhc0JLkohZ5z.jpeg" style="width:50%; max-width:400px;">

## Citation

If you use data from NaturalReasoning, please cite with the following BibTeX entry:

```
@misc{yuan2025naturalreasoningreasoningwild28m,
  title={NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions},
  author={Weizhe Yuan and Jane Yu and Song Jiang and Karthik Padthe and Yang Li and Dong Wang and Ilia Kulikov and Kyunghyun Cho and Yuandong Tian and Jason E Weston and Xian Li},
  year={2025},
  eprint={2502.13124},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.13124},
}
```
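The reference-answer breakdown reported in the description (no answer, single word, short, long) could be approximated on loaded records with a sketch like the one below. It is not the authors' procedure: the field name `reference_answer`, the helper names, and the 20-word cutoff between "short" and "long" are all assumptions made for illustration, since the description does not state how the buckets were defined.

```python
from collections import Counter


def answer_bucket(reference_answer, short_max_words=20):
    """Classify one reference answer as none / single_word / short / long.

    short_max_words is an assumed cutoff; the dataset description does not
    say where "short" ends and "long" begins.
    """
    text = (reference_answer or "").strip()
    if not text:
        return "none"
    n_words = len(text.split())
    if n_words == 1:
        return "single_word"
    if n_words <= short_max_words:
        return "short"
    return "long"


def bucket_shares(records, field="reference_answer", short_max_words=20):
    """Return the percentage of records falling in each bucket.

    `records` is any iterable of dict-like rows; `field` is the assumed
    name of the column holding the reference answer.
    """
    counts = Counter(
        answer_bucket(row.get(field), short_max_words) for row in records
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: 100.0 * n / total for bucket, n in counts.items()}


# Hypothetical usage on the loaded dataset (split name "train" is assumed):
#   shares = bucket_shares(ds["train"])
```

Because the cutoff is a guess, the resulting percentages should not be expected to match the published 18.29% / 9.71% / 21.58% / 50.42% split exactly.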

Provider:
maas
Created:
2025-02-21
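Collected summary
Dataset introduction
Background overview
NaturalReasoning is a large-scale dataset for general reasoning tasks. It contains high-quality, challenging questions backtranslated from pretraining corpora, then deduplicated and decontaminated. The release is a subset of 1.1 million questions with reference answers and model-generated responses, intended to support research on training strong LLM reasoners.
The summary above was collected and generated by 遇见数据集.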