meituan-longcat/R-HORIZON-Websearch

Name: meituan-longcat/R-HORIZON-Websearch
Creator: meituan-longcat
Published: 2025-10-21 12:46:27
License: No description available

Source: Hugging Face. Updated: 2025-10-21. Indexed: 2025-10-25.

Download link:

https://hf-mirror.com/datasets/meituan-longcat/R-HORIZON-Websearch

Description:

R-HORIZON is a method for eliciting long-horizon reasoning behavior in Large Reasoning Models (LRMs) through query composition: it turns isolated problems into complex multi-step reasoning scenarios made up of interdependent problems. Evaluations on these scenarios show that even the most advanced LRMs suffer significant performance degradation when the interdependent problems span long reasoning horizons. The R-HORIZON Benchmark comprises 6 representative datasets from mathematics, code generation, and agent applications, and its long-horizon reasoning data can also be used for reinforcement learning with verifiable rewards (RLVR). Each record in the dataset includes the input, an instance ID, the origin instance IDs, the target, the number of problems, and the selected variables.
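To make the record layout and the idea of query composition concrete, here is a minimal Python sketch. It is not the authors' pipeline: the `ComposedRecord` class, the `compose` helper, the snake_case field names, the "Let x be the answer to Problem N" wording, and the toy arithmetic problems are all assumptions chosen for illustration. Only the list of fields (input, instance ID, origin instance IDs, target, number of problems, selected variables) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class ComposedRecord:
    """Illustrative record; field names mirror the description, not a confirmed schema."""
    input: str                      # composed multi-problem prompt
    instance_id: str                # ID of the composed example
    origin_instance_ids: list[str]  # IDs of the original, isolated problems
    target: list[str]               # expected answers (assumed: one per sub-problem)
    num_problems: int               # how many problems were chained
    selected_variables: list[str]   # variables that link one problem to the next


def compose(problems: list[dict], instance_id: str) -> ComposedRecord:
    """Chain problems so each one after the first depends on the previous answer."""
    parts = []
    variables = []
    for i, problem in enumerate(problems, start=1):
        text = problem["question"]
        if i > 1:
            # Hypothetical linking scheme: bind a variable to the previous answer.
            var = problem["variable"]
            variables.append(var)
            text = f"Let {var} be the answer to Problem {i - 1}. {text}"
        parts.append(f"Problem {i}: {text}")
    return ComposedRecord(
        input="\n".join(parts),
        instance_id=instance_id,
        origin_instance_ids=[p["id"] for p in problems],
        target=[p["answer"] for p in problems],
        num_problems=len(problems),
        selected_variables=variables,
    )


# Toy problems, invented for this example only.
problems = [
    {"id": "q1", "question": "What is 3 + 4?", "answer": "7"},
    {"id": "q2", "question": "What is x * 2?", "answer": "14", "variable": "x"},
]
record = compose(problems, "composed-0")
print(record.input)
```

In this sketch the second problem cannot be solved without first solving the first, which is the kind of interdependence the description refers to. The real dataset may encode dependencies and targets differently.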

Provided by:

meituan-longcat

Collected Summary

Background and Challenges

Background Overview

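R-HORIZON-Websearch is a dataset focused on long-horizon reasoning. It uses query composition to turn simple problems into multi-step reasoning scenarios, with the aim of evaluating and improving the performance of large reasoning models. It contains 6 sub-datasets drawn from mathematics, code generation, and agent applications, and supports reinforcement learning with verifiable rewards. Its structured design, which covers input questions, answers, and variable information, helps reveal the limitations of models on complex reasoning tasks.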

The content above was collected and summarized by 遇见数据集.
