Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding
Source: DataCite Commons. Updated: 2025-10-09. Indexed: 2026-05-04.
Download link:
https://researchdata.ntu.edu.sg/citation?persistentId=doi:10.21979/N9/KTBVSQ
Description:
We introduce the Video Thinking Test (Video-TT), a benchmark designed to assess whether video LLMs can interpret real-world videos as effectively as humans. Video-TT 1) differentiates between errors caused by inadequate frame sampling and genuine gaps in understanding complex visual narratives, and 2) evaluates robustness against natural adversarial questions. Video-TT comprises 1,000 YouTube Shorts videos, each with one open-ended question and four adversarial questions that probe visual and narrative complexity. Our evaluation shows a significant gap between video LLMs and human performance, underscoring the need for benchmarks like Video-TT to advance video understanding.
Provider:
DR-NTU (Data)
Created:
2025-09-17



