optimal_thinking_bench

Name: optimal_thinking_bench
Creator: maas
Published: 2026-01-06 16:44:31
License: No description available

ModelScope community; updated 2026-01-06; indexed 2025-12-06

Download link:

https://modelscope.cn/datasets/facebook/optimal_thinking_bench


Resource description:

This dataset is released as part of the [OptimalThinkingBench](https://arxiv.org/abs/2508.13141) research project.

IMPORTANT: This is only a subset of OptimalThinkingBench and does not contain the math problems. To download the full dataset, please refer to our project materials [here](https://github.com/facebookresearch/RAM/tree/main/projects/otb).

## Loading the dataset with `datasets`

This dataset is built using [Llama-4-Maverick](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct) and [Reasoning-Gym](https://github.com/open-thought/reasoning-gym). Details on how the dataset was generated can be found in the [OptimalThinkingBench paper](https://arxiv.org/abs/2508.13141).

The minimal example below shows how to load the dataset.

```python
from datasets import load_dataset
import json

# Load the single 'train' split of the benchmark.
dataset = load_dataset("facebook/OptimalThinkingBench")['train']

# Iterate over each subset and print its rows.
for subset in ['overthinkingbench', 'underthinkingbench']:
    for row in dataset.filter(lambda x: x['subset'] == subset):
        print('Question: ', row['question'])
        print('Answer: ', row['answer'])
        # 'metadata' is stored as a JSON string, so decode it first.
        print('Metadata: ', json.loads(row['metadata']))
        print('--------------------------------')
```

## Citation

If you use data or code from this work, please cite it with the following BibTeX entry:

```bibtex
@article{aggarwal2025otb,
  title={OptimalThinkingBench: Evaluating Over and Underthinking in LLMs},
  author={Aggarwal, Pranjal and Kim, Seungone and Lanchantin, Jack and Welleck, Sean and Weston, Jason and Kulikov, Ilia and Saha, Swarnadeep},
  journal={arXiv preprint arXiv:2508.13141},
  year={2025}
}
```

## License

Use of this repository and related resources is governed by the OptimalThinkingBench Research License.
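As a follow-up to the loading example, the sketch below shows one way to group rows by their `subset` field and decode the JSON-encoded `metadata` string in a single pass, rather than filtering the dataset once per subset. It works on any iterable of row dictionaries, so it should also accept the `train` split returned by `load_dataset`. The helper name `group_by_subset`, the sample rows, and the metadata keys (`domain`, `task`) are made up for illustration and are not taken from the dataset itself.

```python
import json
from collections import defaultdict


def group_by_subset(rows):
    """Group rows by 'subset' and decode each row's JSON 'metadata' string.

    Hypothetical helper: only the field names 'subset' and 'metadata'
    come from the loading example; everything else is an assumption.
    """
    grouped = defaultdict(list)
    for row in rows:
        record = dict(row)  # copy so the input row is left unchanged
        record["metadata"] = json.loads(row["metadata"])
        grouped[row["subset"]].append(record)
    return dict(grouped)


# Made-up rows that mimic the fields used in the loading example.
sample_rows = [
    {
        "subset": "overthinkingbench",
        "question": "What is the capital of France?",
        "answer": "Paris",
        "metadata": json.dumps({"domain": "geography"}),
    },
    {
        "subset": "underthinkingbench",
        "question": "Find the shortest path through the grid.",
        "answer": "right, right, down",
        "metadata": json.dumps({"task": "maze"}),
    },
]

grouped = group_by_subset(sample_rows)
for name, records in grouped.items():
    print(name, len(records))
```

With the real data, passing the loaded `train` split in place of `sample_rows` should give one list per subset, which can be convenient when over- and underthinking results are scored separately.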


Provider:

maas

Created:

2025-08-27
