CREPE

Source: arXiv, indexed 2025-09-30

Download link:

https://github.com/velocitycavalry/crepe

Resource overview:

CREPE is a benchmark dataset designed to evaluate the compositional reasoning of vision-language models across different scenarios. Results on the benchmark are reported as Recall@1. The dataset contains 52,189 images and 129,558 reference sentences.
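To make the reported metric concrete, the following is a minimal sketch of how Recall@1 is typically computed for image-to-text retrieval. It is not the evaluation code from the CREPE repository. The function name `recall_at_1`, the score layout (one row of similarity scores per image, with the ground-truth sentence at a fixed position among the candidates), and the example numbers are all assumptions made for illustration.

```python
from typing import Sequence


def recall_at_1(scores: Sequence[Sequence[float]], correct_index: int = 0) -> float:
    """Fraction of images whose top-scoring candidate is the ground-truth sentence.

    scores[i][j] is a (hypothetical) similarity between image i and candidate
    sentence j. The ground-truth sentence is assumed to sit at `correct_index`
    in every row; this layout is an illustrative assumption.
    """
    if not scores:
        raise ValueError("scores must contain at least one row")
    hits = 0
    for row in scores:
        # Index of the highest-scoring candidate (first one wins on ties).
        best = max(range(len(row)), key=row.__getitem__)
        if best == correct_index:
            hits += 1
    return hits / len(scores)


# Made-up similarity scores for four images with three candidates each.
example_scores = [
    [0.9, 0.2, 0.1],  # ground truth ranked first
    [0.3, 0.8, 0.1],  # a distractor ranked first
    [0.7, 0.6, 0.5],  # ground truth ranked first
    [0.1, 0.2, 0.3],  # a distractor ranked first
]
print(recall_at_1(example_scores))
```

With these made-up scores, two of the four images rank the ground-truth sentence first, so the function returns 0.5.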
