minimal_video_pairs

Name: minimal_video_pairs
Creator: maas
Published: 2025-12-05 16:38:18
License: not specified

ModelScope community. Updated 2025-12-05; indexed 2025-06-14.

Download link:

https://modelscope.cn/datasets/facebook/minimal_video_pairs


Description:

# Minimal Video Pairs

A shortcut-aware benchmark for spatio-temporal and intuitive physics video understanding (VideoQA) using minimally different video pairs.

- [GitHub](https://github.com/facebookresearch/minimal_video_pairs)

<p align="center">
<img src="overview.png" width="75%" alt="Overview"/>
</p>

For legal reasons, we are unable to upload the videos directly to Hugging Face. However, we provide scripts for downloading the videos in our GitHub repository.

Our benchmark is built on top of videos sourced from 9 domains:

| Subset | Data sources |
| --- | --- |
| Human object interactions | [PerceptionTest](https://github.com/google-deepmind/perception_test), [SomethingSomethingV2](https://www.qualcomm.com/developer/software/something-something-v-2-dataset) |
| Robot object interactions | [Language Table](https://github.com/google-research/language-table) |
| Intuitive Physics and collisions | [IntPhys](https://intphys.cognitive-ml.fr/), [InfLevel](https://github.com/allenai/inflevel), [GRASP](https://github.com/i-machine-think/grasp), [CLEVRER](http://clevrer.csail.mit.edu/) |
| Temporal Reasoning | [STAR](https://bobbywu.com/STAR/), [Vinoground](https://vinoground.github.io/) |

## Run evaluation

To enable reproducible evaluation, we use the [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval) library. We provide the task files you need to run `mvp` and `mvp_mini`. `mvp_mini` is a smaller, balanced evaluation set with 9k examples that enables faster evaluation. Please follow the instructions in our [GitHub repository](https://github.com/facebookresearch/minimal_video_pairs) to run reproducible evals.

## Leaderboard

We also release our [Physical Reasoning Leaderboard](https://huggingface.co/spaces/facebook/physical_reasoning_leaderboard), where you can submit your model outputs for this dataset. Use `mvp` or `mvp_mini` as the `data_name` for submissions against the `full` and `mini` splits, respectively.

## Citation and acknowledgements

We are grateful to the creators of the datasets listed above, whose videos we used to create this benchmark. Please cite us if you use the benchmark or any other part of the paper:

```bibtex
@article{krojer2025shortcut,
  title={A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs},
  author={Benno Krojer and Mojtaba Komeili and Candace Ross and Quentin Garrido and Koustuv Sinha and Nicolas Ballas and Mahmoud Assran},
  journal={arXiv},
  year={2025}
}
```
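As a rough illustration of the evaluation step described under "Run evaluation", the sketch below assembles an lmms-eval command line for either task. It only builds and prints the command and does not run it. The helper name `build_lmms_eval_command`, the placeholder model name `your_model`, and the `./logs` output directory are hypothetical. The flags shown follow the general lmms-eval command-line pattern and are an assumption here: the MVP task files must first be made available to lmms-eval as described in the GitHub repository, whose instructions take precedence over this sketch.

```python
import shlex
import sys

# Task names taken from the dataset card; everything else is illustrative.
MVP_TASKS = ("mvp", "mvp_mini")


def build_lmms_eval_command(task: str, model: str, output_path: str = "./logs") -> list[str]:
    """Build an argv list for one lmms-eval run without executing it.

    `model` is a placeholder for whichever lmms-eval model name you use;
    `output_path` is a hypothetical directory for logs and per-sample outputs.
    """
    if task not in MVP_TASKS:
        raise ValueError(f"task must be one of {MVP_TASKS}, got {task!r}")
    return [
        sys.executable, "-m", "lmms_eval",
        "--model", model,
        "--tasks", task,
        "--batch_size", "1",
        "--log_samples",  # keep per-example outputs, e.g. for a leaderboard submission
        "--output_path", output_path,
    ]


if __name__ == "__main__":
    # Print the command for the smaller 9k-example set so it can be inspected first.
    print(shlex.join(build_lmms_eval_command("mvp_mini", model="your_model")))
```

Starting with `mvp_mini` gives a faster check that the setup works before committing to the full `mvp` run.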


Provider:

maas

Created:

2025-06-12
