DocQA-RL-1.6K
Source: ModelScope community. Updated: 2026-05-14. Indexed: 2025-05-31.
Download link:
https://modelscope.cn/datasets/iic/DocQA-RL-1.6K
Description:
To construct a challenging RL dataset for verifiable long-context reasoning, we develop [🤗 DocQA-RL-1.6K](https://huggingface.co/datasets/Tongyi-Zhiwen/DocQA-RL-1.6K), which comprises 1.6K DocQA problems across three reasoning domains:
(1) Mathematical Reasoning: We use 600 problems from the DocMath dataset, which require numerical reasoning over long, specialized documents such as financial reports. For DocMath, we sample 75% of the items in each subset of its validation split for training and hold out the remaining 25% for evaluation;
(2) Logical Reasoning: We employ DeepSeek-R1 to synthesize 600 multiple-choice questions that require logical analysis of real-world documents from our curated collection, spanning the legal, financial, insurance, and production domains;
(3) Multi-Hop Reasoning: We sample 200 examples from MultiHopRAG and 200 examples from Musique, emphasizing cross-document reasoning.
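The composition and the per-subset 75%/25% split described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the subset names (`subset_a`, `subset_b`), the fixed seed, and rounding the training share down are all assumptions made for the example.

```python
import random

# Problem counts per source, as stated in the description.
COMPOSITION = {
    "DocMath": 600,
    "Logical (synthesized with DeepSeek-R1)": 600,
    "MultiHopRAG": 200,
    "Musique": 200,
}


def split_subset(items, train_frac=0.75, seed=0):
    """Shuffle one subset and split it into (train, eval) lists.

    Assumption: the training share is rounded down; the description
    does not say how fractional counts are handled.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def split_by_subset(subsets, train_frac=0.75, seed=0):
    """Apply the split to each subset separately, then pool the results."""
    train, evaluation = [], []
    for name in sorted(subsets):
        tr, ev = split_subset(subsets[name], train_frac, seed)
        train.extend(tr)
        evaluation.extend(ev)
    return train, evaluation


if __name__ == "__main__":
    # The four sources add up to the 1.6K problems in the dataset name.
    print(sum(COMPOSITION.values()))

    # Hypothetical toy subsets standing in for DocMath's subsets.
    toy = {"subset_a": list(range(8)), "subset_b": list(range(100, 112))}
    train, evaluation = split_by_subset(toy)
    print(len(train), len(evaluation))
```

Splitting each subset separately, rather than pooling first, keeps every subset represented in both the training and evaluation portions at the same ratio.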
Provider:
maas
Created:
2025-05-23



