PahaII/OpenResearcher-Data
Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/PahaII/OpenResearcher-Data
Resource description:
OpenResearcher Data is the training data for the OpenResearcher (Qwen3.5-35B-A3B) deep-research agent. The dataset comprises multiple configurations covering reinforcement learning (RL) and supervised fine-tuning (SFT) training data. Its file map lists each file's path, row count, number of unique qids, and purpose. The dataset's schema section describes the verl format shared by all RL parquets, along with the specific files that add passrate-related columns. The dataset also provides detailed pass-rate evaluation information, including aggregate pass-rates and correctness distributions. The README further lists three curated 1k subsets for RL ablation studies and outlines the dataset's intended uses: reproducing SFT and RLOO training, difficulty-stratified RL experiments, rejection-sampling fine-tuning and curriculum learning, and pass-rate-conditioned ablations.
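As an illustration of what difficulty stratification by pass-rate might look like, here is a minimal sketch in plain Python. The `pass_rate` field name, the 0.25 and 0.75 thresholds, and the sample rows are all assumptions made for this example; the actual passrate-related column names and any recommended cutoffs should be taken from the dataset's README.

```python
from collections import defaultdict


def stratify_by_pass_rate(rows, low=0.25, high=0.75):
    """Group question ids into 'hard', 'medium', and 'easy' buckets.

    `rows` is an iterable of dicts with a `qid` and a `pass_rate` in [0, 1].
    The `pass_rate` key and both thresholds are hypothetical choices.
    """
    buckets = defaultdict(list)
    for row in rows:
        rate = row["pass_rate"]  # hypothetical column name
        if rate < low:
            buckets["hard"].append(row["qid"])
        elif rate <= high:
            buckets["medium"].append(row["qid"])
        else:
            buckets["easy"].append(row["qid"])
    return dict(buckets)


# Made-up rows for demonstration only; not taken from the dataset.
rows = [
    {"qid": "q1", "pass_rate": 0.0},
    {"qid": "q2", "pass_rate": 0.5},
    {"qid": "q3", "pass_rate": 1.0},
    {"qid": "q4", "pass_rate": 0.125},
]

buckets = stratify_by_pass_rate(rows)
print(buckets)
```

In practice the rows would come from one of the RL parquet files rather than an inline list, and the bucket boundaries would be chosen to suit the experiment at hand.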
Provider:
PahaII



