optimal_thinking_bench
Source: ModelScope community · Updated 2026-01-06 · Indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/facebook/optimal_thinking_bench
Description:
This dataset is released as part of the [OptimalThinkingBench](https://arxiv.org/abs/2508.13141) research project.
IMPORTANT: This is only a subset of OptimalThinkingBench and does not contain the math problems. To download the full dataset, see our [project materials](https://github.com/facebookresearch/RAM/tree/main/projects/otb).
## Loading the dataset with `datasets`
This dataset is built using [Llama-4-Maverick](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct) and [Reasoning-Gym](https://github.com/open-thought/reasoning-gym). Details on how the dataset was generated can be found in the [OptimalThinkingBench paper](https://arxiv.org/abs/2508.13141).
The minimal example below shows how to load the dataset.
```python
from datasets import load_dataset
import json

# Load the 'train' split; each row carries a 'subset' label.
dataset = load_dataset("facebook/OptimalThinkingBench")['train']

for subset in ['overthinkingbench', 'underthinkingbench']:
    # Keep only the rows that belong to the current subset.
    for row in dataset.filter(lambda x: x['subset'] == subset):
        print('Question: ', row['question'])
        print('Answer: ', row['answer'])
        # 'metadata' is stored as a JSON string, so decode it before printing.
        print('Metadata: ', json.loads(row['metadata']))
        print('--------------------------------')
```
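Since `metadata` is stored as a JSON string, it can be convenient to decode it once and group rows by subset instead of filtering repeatedly. The following is a minimal sketch, assuming only the row fields used in the loading example (`subset`, `question`, `answer`, `metadata`). The helper `group_by_subset`, the sample rows, and the `domain` metadata key are hypothetical stand-ins for illustration, not part of the dataset or its tooling.

```python
import json
from collections import defaultdict


def group_by_subset(rows):
    """Group rows by their 'subset' field and decode the JSON 'metadata' string."""
    grouped = defaultdict(list)
    for row in rows:
        decoded = dict(row)  # copy so the input row is left unchanged
        decoded["metadata"] = json.loads(row["metadata"])
        grouped[row["subset"]].append(decoded)
    return dict(grouped)


# Hypothetical stand-in rows; real rows would come from load_dataset(...)['train'].
sample_rows = [
    {"subset": "overthinkingbench", "question": "Q1", "answer": "A1",
     "metadata": '{"domain": "demo"}'},
    {"subset": "underthinkingbench", "question": "Q2", "answer": "A2",
     "metadata": '{"domain": "demo"}'},
    {"subset": "overthinkingbench", "question": "Q3", "answer": "A3",
     "metadata": '{"domain": "demo"}'},
]

grouped = group_by_subset(sample_rows)
for name, items in grouped.items():
    print(name, len(items))
```

With the real data, passing the `dataset` object from the loading example in place of `sample_rows` should behave the same way, since iterating over a `datasets.Dataset` yields one dict per row.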
## Citation
If you use the data or code from this work, please cite it with the following BibTeX entry:
```bibtex
@article{aggarwal2025otb,
title={OptimalThinkingBench: Evaluating Over and Underthinking in LLMs},
author={Aggarwal, Pranjal and Kim, Seungone and Lanchantin, Jack and Welleck, Sean and Weston, Jason and Kulikov, Ilia and Saha, Swarnadeep},
journal={arXiv preprint arXiv:2508.13141},
year={2025}
}
```
## License
Use of this repository and related resources is governed by the OptimalThinkingBench Research License.
Provider: maas
Created: 2025-08-27



