OpenMOSS-Team/VideoThinkBench

Source: Hugging Face. Updated: 2026-03-13. Indexed: 2026-02-07.
Download link:
https://hf-mirror.com/datasets/OpenMOSS-Team/VideoThinkBench
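
For readers who prefer to fetch the files programmatically, here is a minimal sketch. It assumes the `huggingface_hub` Python package is installed; the helper names `dataset_page_url` and `download_dataset` and the default `local_dir` value are illustrative and not part of the listing. As written, the download call goes to the default Hugging Face endpoint; `huggingface_hub` reads the `HF_ENDPOINT` environment variable, so setting it to the mirror host before starting Python should route the request through the mirror instead.

```python
from pathlib import Path

REPO_ID = "OpenMOSS-Team/VideoThinkBench"  # repository name from the listing
MIRROR = "https://hf-mirror.com"           # host used by the download link


def dataset_page_url(repo_id: str = REPO_ID, endpoint: str = MIRROR) -> str:
    """Build the dataset page URL for a given hub endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download_dataset(local_dir: str = "VideoThinkBench") -> Path:
    """Download a snapshot of the dataset repository (hypothetical helper)."""
    # Imported lazily so the URL helper works without the library installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repository is a dataset, not a model
        local_dir=local_dir,  # assumed target folder; change as needed
    )
    return Path(path)


if __name__ == "__main__":
    print(dataset_page_url())
```

Calling `download_dataset()` requires network access and may transfer a large amount of data, so it is left out of the `__main__` block.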
Summary:
VideoThinkBench is a comprehensive benchmark dataset designed to evaluate the reasoning capabilities of video generation models. It consists of two main categories of tasks: vision-centric tasks and text-centric tasks. Vision-centric tasks include Eyeballing Puzzles (spatial reasoning tasks), Visual Puzzles (pattern recognition and visual logic problems), ARC-AGI-2 (abstract reasoning tasks), and Mazes (path-finding and navigation challenges). Text-centric tasks are adapted from established benchmarks such as MATH, GSM8K, MMLU, and MMMU, covering areas like mathematical reasoning, multimodal understanding, general knowledge, and scientific reasoning. The dataset aims to unify visual and textual reasoning through video generation models, overcoming the limitations of traditional image and text-based reasoning paradigms.
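
The task taxonomy described above can be written down as a small lookup table. This is only a sketch built from the summary text: the dictionary layout and the `category_of` helper are illustrative and do not reflect the dataset's actual file or split structure. The text-centric list holds only the source benchmarks named in the summary, which presents them as examples rather than a complete list.

```python
from typing import Optional

# Task names and glosses are taken from the summary; the layout is an assumption.
TASKS = {
    "vision-centric": {
        "Eyeballing Puzzles": "spatial reasoning",
        "Visual Puzzles": "pattern recognition and visual logic",
        "ARC-AGI-2": "abstract reasoning",
        "Mazes": "path-finding and navigation",
    },
    "text-centric": {
        # Source benchmarks named in the summary (not necessarily exhaustive).
        "MATH": "adapted benchmark",
        "GSM8K": "adapted benchmark",
        "MMLU": "adapted benchmark",
        "MMMU": "adapted benchmark",
    },
}


def category_of(task: str) -> Optional[str]:
    """Return the category a task belongs to, or None if it is not listed."""
    for category, tasks in TASKS.items():
        if task in tasks:
            return category
    return None
```

For example, `category_of("Mazes")` returns `"vision-centric"`, while a name that is not in the table returns `None`.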
Provider:
OpenMOSS-Team