OpenDataArena/MMFineReason
Source: Hugging Face. Updated 2025-12-23. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/OpenDataArena/MMFineReason
Resource description:
FineReason is a multimodal reasoning dataset designed to strengthen the visual reasoning of large multimodal models (LMMs). It covers STEM (Science, Technology, Engineering, and Mathematics), visual puzzles, games, and complex diagram reasoning. Each entry contains an ID, a question, an image, and a reasoning-style answer. The answers are long-chain responses distilled from Qwen3-VL-235B-a22B-thinking, which keeps reasoning traces consistent across sub-datasets and promotes long-chain, interpretable multimodal reasoning. By curating and distilling existing high-quality reasoning datasets into a single consistent reasoning style, the dataset addresses the data imbalance and limited reasoning quality found in existing datasets. The dataset is continuously expanding and includes multiple sub-datasets, each with a different number of examples. It is part of the OpenDataArena platform, which focuses on discovering, evaluating, and advancing high-quality datasets for AI post-training.
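The entry layout described above (an ID, a question, an image, and a reasoning-style answer) could be represented as follows. This is a minimal sketch, not the dataset's actual schema: the field names `id`, `question`, `image`, and `answer`, the class `ReasoningExample`, and the helper `parse_example` are all hypothetical and should be checked against the dataset files before use.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Hypothetical field names: the description only states that each entry
# holds an ID, a question, an image, and a reasoning-style answer.
REQUIRED_FIELDS = ("id", "question", "image", "answer")


@dataclass(frozen=True)
class ReasoningExample:
    id: str
    question: str
    image: Any   # assumed: raw bytes, a file path, or a decoded image object
    answer: str  # long-chain, reasoning-style answer


def parse_example(record: Mapping[str, Any]) -> ReasoningExample:
    """Build a ReasoningExample from a raw record, failing on missing fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    # Extra keys in the record are ignored rather than rejected.
    return ReasoningExample(**{name: record[name] for name in REQUIRED_FIELDS})
```

Validating records this way surfaces schema differences between sub-datasets early, which matters when several sources are merged into one training set.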
Provider:
OpenDataArena



