yoonsanglee/deepsearchqa-react

Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/yoonsanglee/deepsearchqa-react
Description:
AggAgent is an agentic aggregation framework that scales long-horizon agents at test time by sampling multiple parallel rollouts from a base agent and then aggregating their evidence and solutions. This dataset card releases the ReAct base rollouts that AggAgent consumes, i.e. the single-agent trajectories produced before any aggregation step. Each rollout was generated by running a ReAct-style deep-research agent (reasoning → tool call → observation → ... → final solution) against the benchmark prompts. The agent scaffold is adapted from Tongyi DeepResearch. Each trajectory includes the full message stream, the extracted prediction, tool and rollout cost accounting, and an auto-judge verdict, so the data can be used directly for Best-of-N selection, aggregator training, or behavioural analysis of the base policy. This release covers three open-weights backbones: GLM-4.7-Flash, MiniMax-M2.5, and Qwen3.5-122B-A10B. Each backbone is shipped as a single Parquet file. Eight parallel rollouts are stored per benchmark instance (roll_out_count = 8; see metadata).
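
As a rough illustration of how per-instance parallel rollouts might feed a simple selection baseline, the sketch below applies majority voting over extracted predictions. It is not AggAgent's aggregation method, and the field names `instance_id` and `prediction` are hypothetical placeholders; the actual column names should be taken from the dataset's metadata. The rows are made-up toy values, not records from the release.

```python
from collections import Counter, defaultdict


def majority_vote(rollouts):
    """Return the most frequent prediction for each instance.

    `rollouts` is an iterable of dicts with the (hypothetical) keys
    "instance_id" and "prediction". Ties go to the prediction seen first.
    """
    by_instance = defaultdict(list)
    for row in rollouts:
        by_instance[row["instance_id"]].append(row["prediction"])
    # Counter.most_common orders equal counts by first occurrence.
    return {
        instance_id: Counter(predictions).most_common(1)[0][0]
        for instance_id, predictions in by_instance.items()
    }


# Toy rows for illustration only (the release stores 8 rollouts per instance).
toy_rollouts = [
    {"instance_id": "q1", "prediction": "Paris"},
    {"instance_id": "q1", "prediction": "Lyon"},
    {"instance_id": "q1", "prediction": "Paris"},
    {"instance_id": "q2", "prediction": "42"},
    {"instance_id": "q2", "prediction": "41"},
]

selected = majority_vote(toy_rollouts)
print(selected)
```

Majority voting ignores the auto-judge verdict and the cost fields, so it serves only as a label-free baseline against which an aggregator or a Best-of-N selector could be compared.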
Provider:
yoonsanglee