optimal_thinking_bench

ModelScope community · updated 2026-01-06 · indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/facebook/optimal_thinking_bench
Description:
This dataset is released as part of the [OptimalThinkingBench](https://arxiv.org/abs/2508.13141) research project.

IMPORTANT: This is only a subset of OptimalThinkingBench and does not contain the math problems. To download the full dataset, please refer to our project materials [here](https://github.com/facebookresearch/RAM/tree/main/projects/otb).

## Loading the dataset with `datasets`

This dataset is built using [Llama-4-Maverick](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct) and [Reasoning-Gym](https://github.com/open-thought/reasoning-gym). Details on how the dataset was generated can be found in the [OptimalThinkingBench paper](https://arxiv.org/abs/2508.13141).

The minimal example below shows how to load the dataset.

```python
from datasets import load_dataset
import json

# Load the single 'train' split of the dataset.
dataset = load_dataset("facebook/OptimalThinkingBench")['train']

# Iterate over each subset and print its rows.
for subset in ['overthinkingbench', 'underthinkingbench']:
    for row in dataset.filter(lambda x: x['subset'] == subset):
        print('Question: ', row['question'])
        print('Answer: ', row['answer'])
        # 'metadata' is stored as a JSON string, so decode it first.
        print('Metadata: ', json.loads(row['metadata']))
        print('--------------------------------')
```

## Citation

If you use data or code from this work, please cite it with the following BibTeX entry:

```bibtex
@article{aggarwal2025otb,
  title={OptimalThinkingBench: Evaluating Over and Underthinking in LLMs},
  author={Aggarwal, Pranjal and Kim, Seungone and Lanchantin, Jack and Welleck, Sean and Weston, Jason and Kulikov, Ilia and Saha, Swarnadeep},
  journal={arXiv preprint arXiv:2508.13141},
  year={2025}
}
```

## License

Use of this repository and related resources is governed by the OptimalThinkingBench Research License.
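The loading example in the dataset card filters the data once per subset. As a rough sketch of an alternative, the rows can be grouped by subset in a single pass. This assumes each row is a dict with the `subset`, `question`, `answer`, and JSON-encoded `metadata` fields used in that example. The helper name `group_by_subset` and the sample rows (including the `note` metadata key) are hypothetical, made up for illustration, and are not taken from the dataset.

```python
import json
from collections import defaultdict


def group_by_subset(rows):
    """Group rows by their 'subset' field in one pass.

    Hypothetical helper: assumes each row is a dict whose 'metadata'
    value is a JSON string, as in the dataset card's loading example.
    """
    grouped = defaultdict(list)
    for row in rows:
        decoded = dict(row)  # copy so the input row is not mutated
        decoded["metadata"] = json.loads(row["metadata"])
        grouped[row["subset"]].append(decoded)
    return dict(grouped)


# Made-up rows standing in for the real dataset (not actual benchmark data).
sample_rows = [
    {"subset": "overthinkingbench", "question": "q1", "answer": "a1",
     "metadata": '{"note": "illustrative"}'},
    {"subset": "underthinkingbench", "question": "q2", "answer": "a2",
     "metadata": '{"note": "illustrative"}'},
    {"subset": "overthinkingbench", "question": "q3", "answer": "a3",
     "metadata": '{"note": "illustrative"}'},
]

grouped = group_by_subset(sample_rows)
for subset, rows in grouped.items():
    print(subset, len(rows))
```

The same function should accept the loaded `train` split directly, since iterating over it yields dict-like rows, though that has not been verified against the real data here.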

Provider:
maas
Created:
2025-08-27