CREPE
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/velocitycavalry/crepe
Description:
CREPE is a benchmark dataset designed to evaluate the compositional reasoning ability of vision-language models across a range of scenarios. Results on the benchmark are reported as Recall@1. The dataset contains 52,189 images and 129,558 reference sentences.
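As a rough illustration of the Recall@1 metric mentioned above, the sketch below scores a batch of retrieval queries. It assumes each image comes with a list of similarity scores over candidate sentences, with the correct sentence at a known position. The function name `recall_at_1`, the `positive_index` parameter, and the score layout are hypothetical and are not taken from the CREPE repository.

```python
from typing import Sequence


def recall_at_1(scores: Sequence[Sequence[float]], positive_index: int = 0) -> float:
    """Return the fraction of queries whose correct candidate is ranked first.

    scores[q][c] is the model's similarity between query q (e.g. an image)
    and candidate c (e.g. a sentence). The correct candidate is assumed to
    sit at `positive_index` in every row (an assumption of this sketch).
    """
    if not scores:
        raise ValueError("scores must contain at least one query")

    hits = 0
    for row in scores:
        positive = row[positive_index]
        # Count a hit only when the correct candidate strictly beats every
        # other candidate, so ties are treated as misses.
        if all(positive > s for i, s in enumerate(row) if i != positive_index):
            hits += 1
    return hits / len(scores)


# Toy scores for three queries, each with three candidates.
example_scores = [
    [0.9, 0.1, 0.3],  # correct candidate wins
    [0.2, 0.8, 0.1],  # a distractor wins
    [0.5, 0.5, 0.4],  # tie with a distractor, counted as a miss
]
print(recall_at_1(example_scores))
```

Treating ties as misses is a conservative choice made for this sketch; an actual evaluation script may break ties differently.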



