CountBenchQA

Name: CountBenchQA
Creator: maas
Published: 2025-12-26 16:47:52
License: not specified

ModelScope community: updated 2025-12-26, indexed 2025-11-03

Download link:

https://modelscope.cn/datasets/vikhyatk/CountBenchQA

Resource overview:

This dataset was introduced in PaliGemma for evaluating counting in vision language models. This version only includes 491 images from the original CountBench dataset, since some of the original URLs can no longer be accessed.

### Original Description

* CountBench: We introduce a new object counting benchmark called CountBench, automatically curated (and manually verified) from the publicly available LAION-400M image-text dataset. CountBench contains a total of 540 images containing between two and ten instances of a particular object, where their corresponding captions reflect this number.
* CountBenchQA: Each image is paired with a manually generated question about the number of objects in the image to turn CountBench into a VQA task.

```
@article{beyer2024paligemma,
  title={{PaliGemma: A versatile 3B VLM for transfer}},
  author={Lucas Beyer and Andreas Steiner and André Susano Pinto and Alexander Kolesnikov and Xiao Wang and Daniel Salz and Maxim Neumann and Ibrahim Alabdulmohsin and Michael Tschannen and Emanuele Bugliarello and Thomas Unterthiner and Daniel Keysers and Skanda Koppula and Fangyu Liu and Adam Grycner and Alexey Gritsenko and Neil Houlsby and Manoj Kumar and Keran Rong and Julian Eisenschlos and Rishabh Kabra and Matthias Bauer and Matko Bošnjak and Xi Chen and Matthias Minderer and Paul Voigtlaender and Ioana Bica and Ivana Balazevic and Joan Puigcerver and Pinelopi Papalampidi and Olivier Henaff and Xi Xiong and Radu Soricut and Jeremiah Harmsen and Xiaohua Zhai},
  year={2024},
  journal={arXiv preprint arXiv:2407.07726}
}

@article{paiss2023countclip,
  title={{Teaching CLIP to Count to Ten}},
  author={Paiss, Roni and Ephrat, Ariel and Tov, Omer and Zada, Shiran and Mosseri, Inbar and Irani, Michal and Dekel, Tali},
  year={2023},
  journal={arXiv preprint arXiv:2302.12066}
}
```
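Because CountBenchQA frames counting as a VQA task, a model's free-text answer has to be reduced to a number before it can be compared with the reference count. The listing does not say how answers are scored, so the following is only a minimal sketch that assumes exact-match accuracy. The helper names `parse_count` and `exact_match_accuracy`, the `NUMBER_WORDS` table, and the sample answers are all hypothetical and not part of the dataset.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumption: answers may spell out small counts in English words.
# The table covers zero to ten; CountBench images contain two to ten objects.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def parse_count(answer: str) -> Optional[int]:
    """Return the first number (digits or number word) in an answer, else None."""
    for token in re.findall(r"\d+|[a-z]+", answer.lower()):
        if token.isdigit():
            return int(token)
        if token in NUMBER_WORDS:
            return NUMBER_WORDS[token]
    return None


def exact_match_accuracy(pairs: Iterable[Tuple[str, int]]) -> float:
    """Fraction of (predicted answer, reference count) pairs that match exactly."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(parse_count(pred) == gold for pred, gold in pairs)
    return correct / len(pairs)


if __name__ == "__main__":
    # Made-up predictions and reference counts, for illustration only.
    samples = [("There are 3 dogs.", 3), ("four cats", 4), ("many", 5), ("2", 6)]
    print(exact_match_accuracy(samples))
```

Taking the first number in the answer is a deliberate simplification: an answer such as "no one is visible" would be parsed as 1, so a real evaluation would likely need stricter answer normalization.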

Provider:

maas

Created:

2025-09-04

Summary

Dataset introduction