research-plan-gen
ModelScope community · Updated 2026-01-06 · Indexed 2026-01-10
Download link:
https://modelscope.cn/datasets/facebook/research-plan-gen
Resource description:
# RPG Dataset
The Research Plan Generation (RPG) dataset has three subsets covering ML, arXiv, and PubMed research papers. Each subset contains research tasks with evaluation rubrics and reference solutions.
## Dataset Statistics
| Subset | Train | Test | Total |
|--------|-------|------|-------|
| ML | 6,872 | 685 | 7,557 |
| Arxiv | 6,573 | 1,496 | 8,069 |
| Pubmed | 6,423 | 464 | 6,887 |
| **Total** | **19,868** | **2,645** | **22,513** |
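
The totals in the table can be re-derived from the per-split counts. The snippet below is a small sanity check rather than part of the dataset tooling; the dictionary layout and function names are illustrative assumptions, and the numbers are copied from the table.

```python
# Split sizes copied from the statistics table (keys mirror the subset names).
SPLIT_SIZES = {
    "ml": {"train": 6872, "test": 685},
    "arxiv": {"train": 6573, "test": 1496},
    "pubmed": {"train": 6423, "test": 464},
}


def subset_totals(sizes):
    """Sum train and test counts for each subset."""
    return {name: splits["train"] + splits["test"] for name, splits in sizes.items()}


def split_totals(sizes):
    """Sum each split across all subsets."""
    totals = {"train": 0, "test": 0}
    for splits in sizes.values():
        for split, count in splits.items():
            totals[split] += count
    return totals


per_subset = subset_totals(SPLIT_SIZES)
per_split = split_totals(SPLIT_SIZES)
grand_total = sum(per_subset.values())

print(per_subset)
print(per_split, grand_total)
```

The computed values match the Total column and the Total row of the table.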
## Loading the Dataset
```python
from datasets import load_dataset

# Load a specific subset
ml_data = load_dataset("facebook/research-plan-gen", "ml")
arxiv_data = load_dataset("facebook/research-plan-gen", "arxiv")
pubmed_data = load_dataset("facebook/research-plan-gen", "pubmed")

# Access splits
train_data = ml_data['train']
test_data = ml_data['test']

# Get a sample
sample = train_data[0]
print(sample['Goal'])
```
## Dataset Schema
Each sample contains:
- **Goal** (string): The research task or objective to be accomplished
- **Rubric** (list of strings): List of evaluation criteria for assessing the generated plan
- **Reference solution** (string): A reference solution: a summary, generated by Llama 4 Maverick, of how the authors addressed the research task
- **article_id** (string): Unique identifier for the source article
- **q_id** (string): Question/task identifier (the first 16 characters of the SHA-256 hash of the goal)
- **Subdomain** (string): Research subdomain (populated for the arXiv subset, empty string for ML and PubMed)
- **Category** (string): Research category (populated for the arXiv subset and the ML test split, empty string for the ML train split and PubMed)
- **Identifier** (string): Additional identifier for locating the original paper: the OpenReview forum ID for ML papers, the arXiv identifier for arXiv papers, and the PMID for PubMed papers.
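
The `q_id` description suggests the identifier can be recomputed from the goal text. The following is a minimal sketch of that idea, assuming the goal is UTF-8 encoded before hashing and that the hash is taken as a lowercase hex digest; neither detail is stated in the card, and `compute_q_id` is a hypothetical helper name.

```python
import hashlib


def compute_q_id(goal: str) -> str:
    """Return the first 16 hex characters of the SHA-256 digest of `goal`."""
    # Assumption: the goal string is encoded as UTF-8 before hashing.
    return hashlib.sha256(goal.encode("utf-8")).hexdigest()[:16]


# Illustrative input only; real goals come from the 'Goal' field of a sample.
example_id = compute_q_id("An illustrative research goal")
print(example_id, len(example_id))
```

If the assumptions hold, `compute_q_id(sample['Goal'])` should equal `sample['q_id']` for a loaded sample; a mismatch would indicate a different encoding or normalization step.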
## Example
```python
{
    'Goal': 'You are tasked with fine-tuning a Large Multimodal Model...',
    'Rubric': [
        'The proposed method should be parameter-efficient...',
        'The method should allow for intuitive control...',
        ...
    ],
    'Reference solution': 'To fine-tune a Large Multimodal Model...',
    'article_id': 'zxg6601zoc',
    'q_id': 'a396a61f2da8ce60',
    'Subdomain': '',
    'Category': '',
    'Identifier': 'zxg6601zoc'
}
```
```
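
A record shaped like the example can be checked against the schema with a few type tests. This is a sketch under the assumption that a loaded sample behaves like a plain Python dict with a list-valued `Rubric`; `schema_problems` and `STRING_FIELDS` are hypothetical names, and the record reuses the truncated strings from the example as-is.

```python
# Fields the schema describes as strings.
STRING_FIELDS = (
    "Goal", "Reference solution", "article_id", "q_id",
    "Subdomain", "Category", "Identifier",
)


def schema_problems(sample: dict) -> list:
    """Return a list of human-readable schema violations (empty if none)."""
    problems = []
    for field in STRING_FIELDS:
        if not isinstance(sample.get(field), str):
            problems.append(f"{field} should be a string")
    rubric = sample.get("Rubric")
    if not isinstance(rubric, list) or not all(isinstance(r, str) for r in rubric):
        problems.append("Rubric should be a list of strings")
    return problems


# Record mirroring the example (texts are truncated in the original card).
record = {
    "Goal": "You are tasked with fine-tuning a Large Multimodal Model...",
    "Rubric": [
        "The proposed method should be parameter-efficient...",
        "The method should allow for intuitive control...",
    ],
    "Reference solution": "To fine-tune a Large Multimodal Model...",
    "article_id": "zxg6601zoc",
    "q_id": "a396a61f2da8ce60",
    "Subdomain": "",
    "Category": "",
    "Identifier": "zxg6601zoc",
}

print(schema_problems(record))
```

Empty strings pass the check, which matches the schema's note that `Subdomain` and `Category` are empty for some subsets.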
## Citation
If you use this dataset, please cite:
```
@article{goel2025training,
title={Training AI Co-Scientists Using Rubric Rewards},
author={Goel, Shashwat and Hazra, Rishi and Jayalath, Dulhan and Willi, Timon and Jain, Parag and Shen, William F and Leontiadis, Ilias and Barbieri, Francesco and Bachrach, Yoram and Geiping, Jonas and Whitehouse, Chenxi},
journal={arXiv preprint arXiv:2512.23707},
year={2025}
}
```
## License
The data is released under CC-BY-NC and is intended for benchmarking purposes only. The goals, grading rubrics, and solutions are outputs of Llama 4 and are subject to the Llama 4 license (https://github.com/meta-llama/llama-models/tree/main/models/llama4). If you use this portion of the data to create, train, fine-tune, or otherwise improve an AI model that is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. Third-party content pulled from other locations is subject to its own licenses, and you may have other legal obligations or restrictions that govern your use of that content.
Provider:
maas
Created:
2025-12-29



