CountBenchQA

Source: ModelScope community · Updated 2025-12-26 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/vikhyatk/CountBenchQA
Description:
This dataset was introduced in PaliGemma for evaluating counting in vision language models. This version only includes 491 images from the original CountBench dataset, since some of the original URLs can no longer be accessed.

### Original Description

* CountBench: We introduce a new object counting benchmark called CountBench, automatically curated (and manually verified) from the publicly available LAION-400M image-text dataset. CountBench contains a total of 540 images containing between two and ten instances of a particular object, where their corresponding captions reflect this number.
* CountBenchQA: Each image is paired with a manually generated question about the number of objects in the image to turn CountBench into a VQA task.

```
@article{beyer2024paligemma,
  title={{PaliGemma: A versatile 3B VLM for transfer}},
  author={Lucas Beyer and Andreas Steiner and André Susano Pinto and Alexander Kolesnikov and Xiao Wang and Daniel Salz and Maxim Neumann and Ibrahim Alabdulmohsin and Michael Tschannen and Emanuele Bugliarello and Thomas Unterthiner and Daniel Keysers and Skanda Koppula and Fangyu Liu and Adam Grycner and Alexey Gritsenko and Neil Houlsby and Manoj Kumar and Keran Rong and Julian Eisenschlos and Rishabh Kabra and Matthias Bauer and Matko Bošnjak and Xi Chen and Matthias Minderer and Paul Voigtlaender and Ioana Bica and Ivana Balazevic and Joan Puigcerver and Pinelopi Papalampidi and Olivier Henaff and Xi Xiong and Radu Soricut and Jeremiah Harmsen and Xiaohua Zhai},
  year={2024},
  journal={arXiv preprint arXiv:2407.07726}
}

@article{paiss2023countclip,
  title={{Teaching CLIP to Count to Ten}},
  author={Paiss, Roni and Ephrat, Ariel and Tov, Omer and Zada, Shiran and Mosseri, Inbar and Irani, Michal and Dekel, Tali},
  year={2023},
  journal={arXiv preprint arXiv:2302.12066}
}
```
Provider:
maas
Created:
2025-09-04
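Dataset overview
Background
CountBenchQA is a visual question answering (VQA) dataset for evaluating the counting ability of vision language models. It contains 491 images, each paired with a question about the number of objects shown; these images are the still-accessible subset of the original CountBench dataset. The dataset recasts object counting as a VQA task and is used mainly for evaluating the PaliGemma model.
The summary above was collected and generated by 遇见数据集.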
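Because each CountBenchQA item asks for a count, a model's free-text answer has to be reduced to an integer before it can be scored. The sketch below shows one plausible way to do that with exact-match accuracy. It is not the scoring procedure used by PaliGemma or by the dataset authors, which this page does not describe. The helper names `parse_count` and `exact_match_accuracy` and the sample answers are hypothetical and made up for illustration; the word list assumes the two-to-ten range stated in the CountBench description.

```python
import re

# Number words for the two-to-ten range stated in the CountBench description.
WORD_TO_INT = {
    "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def parse_count(answer):
    """Return the first count mentioned in a free-text answer, or None.

    Hypothetical helper: accepts digits ("3") and number words ("five").
    """
    for token in re.findall(r"[a-z]+|[0-9]+", answer.lower()):
        if token.isdigit():
            return int(token)
        if token in WORD_TO_INT:
            return WORD_TO_INT[token]
    return None


def exact_match_accuracy(predictions, targets):
    """Fraction of answers whose parsed count equals the target count."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(parse_count(p) == t for p, t in zip(predictions, targets))
    return correct / len(targets)


if __name__ == "__main__":
    # Made-up model answers and target counts, for illustration only.
    predictions = ["five", "There are 3 dogs.", "seven", "no idea"]
    targets = [5, 3, 8, 4]
    print(exact_match_accuracy(predictions, targets))
```

An answer that mentions no count parses to `None` and is scored as wrong, so a refusal or an off-topic reply lowers the accuracy rather than raising an error.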