PaintSkills
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/j-min/DallEval
Overview:
PaintSkills is a diagnostic dataset and evaluation toolkit for assessing the visual reasoning capabilities of text-to-image generation models. It covers four skills: object recognition, object counting, color recognition, and spatial relationship understanding. By controlling the object distributions and the input text, it addresses the statistical bias found in visual question answering (VQA) datasets. The training and validation splits for the object recognition, object counting, color recognition, and spatial relationship understanding skills contain 21,000, 25,200, 25,200, and 28,244 samples, respectively.



