yoonsanglee/deepsearchqa-react

Name: yoonsanglee/deepsearchqa-react
Creator: yoonsanglee
Published: 2026-04-29 18:42:39
License: No description available

Source: Hugging Face
Updated: 2026-04-29
Indexed: 2026-05-03

Download link:

https://hf-mirror.com/datasets/yoonsanglee/deepsearchqa-react

Description:

AggAgent is an agentic aggregation framework that scales long-horizon agents at test time by sampling multiple parallel rollouts from a base agent and then aggregating their evidence and solutions. This dataset card releases the ReAct base rollouts that AggAgent consumes, i.e. the single-agent trajectories produced before any aggregation step. Each rollout was generated by running a ReAct-style deep-research agent (reasoning → tool call → observation → ... → final solution) against the benchmark prompts. The agent scaffold is adapted from Tongyi DeepResearch. Each trajectory includes the full message stream, the extracted prediction, tool/rollout cost accounting, and an auto-judge verdict, so it can be used directly for Best-of-N selection, aggregator training, or behavioural analysis of the base policy. This release covers three open-weights backbones: GLM-4.7-Flash, MiniMax-M2.5, and Qwen3.5-122B-A10B. Each backbone is shipped as a single Parquet file. Eight parallel rollouts are stored per benchmark instance (roll_out_count = 8; see metadata).
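As a rough illustration of how parallel rollouts per instance might be consumed, the sketch below groups rollouts by benchmark instance, applies a simple majority vote over the extracted predictions, and computes a pass@N-style score from the auto-judge verdicts. Majority voting is a common self-consistency baseline and is not AggAgent's aggregation method. The field names `instance_id`, `prediction`, and `is_correct` are hypothetical placeholders; the actual column names in the Parquet files may differ and should be checked against the dataset metadata. The rows here are synthetic stand-ins, not data from the release.

```python
from collections import Counter, defaultdict


def majority_vote(rollouts, id_key="instance_id", pred_key="prediction"):
    """Pick the most frequent prediction among the rollouts of each instance.

    The key names are assumptions for illustration, not the dataset's schema.
    """
    grouped = defaultdict(list)
    for row in rollouts:
        grouped[row[id_key]].append(row[pred_key])
    # When predictions tie, the result is whichever one most_common returns first.
    return {iid: Counter(preds).most_common(1)[0][0] for iid, preds in grouped.items()}


def pass_at_n(rollouts, id_key="instance_id", correct_key="is_correct"):
    """Fraction of instances with at least one rollout judged correct."""
    solved = defaultdict(bool)
    for row in rollouts:
        solved[row[id_key]] = solved[row[id_key]] or bool(row[correct_key])
    return sum(solved.values()) / len(solved) if solved else 0.0


# Synthetic rollouts standing in for rows read from one backbone's Parquet file.
rollouts = [
    {"instance_id": "q1", "prediction": "Paris", "is_correct": True},
    {"instance_id": "q1", "prediction": "Paris", "is_correct": True},
    {"instance_id": "q1", "prediction": "Lyon", "is_correct": False},
    {"instance_id": "q2", "prediction": "42", "is_correct": False},
    {"instance_id": "q2", "prediction": "41", "is_correct": False},
]

votes = majority_vote(rollouts)
score = pass_at_n(rollouts)
print(votes["q1"], score)
```

With the real data, each instance would contribute eight rollouts rather than the two or three shown here, and the rows would come from the downloaded Parquet file instead of an inline list.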

Provider:

yoonsanglee
