Voxel51/ColorSwap
Source: Hugging Face | Updated: 2024-07-04 | Indexed: 2024-07-06
Download link:
https://hf-mirror.com/datasets/Voxel51/ColorSwap
Description:
ColorSwap is a benchmark dataset for evaluating and improving how well multimodal models match objects with their colors. It contains 2,000 unique image-caption pairs, grouped into 1,000 examples. Each example consists of a caption-image pair and its "color-swapped" counterpart, following the Winoground schema: the two captions contain the same words, but the color words are rearranged so that they modify different objects. The dataset was built by combining automated caption and image generation with human input to ensure naturalness and accuracy. Images were generated with several diffusion models, including Stable Diffusion, Midjourney, and DALLE 3. Evaluations show that even state-of-the-art models struggle with this task, while fine-tuning on the ColorSwap training set yields significant performance gains. The dataset thus provides a targeted benchmark for the compositional color comprehension of multimodal models.
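To make the color-swap idea concrete, here is a minimal sketch of a check that two captions form such a pair. It is not the dataset authors' tooling. The color vocabulary `COLOR_WORDS`, the helper `tokenize`, and the function `is_color_swap` are hypothetical names introduced for illustration, and the check makes the simplifying assumption that non-color words stay in the same positions in both captions.

```python
from collections import Counter

# Hypothetical color vocabulary for illustration only; the dataset's
# actual color list is not stated in this description.
COLOR_WORDS = {
    "red", "blue", "green", "yellow", "black", "white",
    "orange", "purple", "pink", "brown", "gray",
}


def tokenize(caption: str) -> list[str]:
    """Lowercase the caption, drop periods and commas, split on whitespace."""
    return caption.lower().replace(".", "").replace(",", "").split()


def is_color_swap(caption_a: str, caption_b: str) -> bool:
    """Return True if the captions use the same words and differ only in
    where the color words appear (assumption: other words keep their place)."""
    a, b = tokenize(caption_a), tokenize(caption_b)
    # Both captions must contain exactly the same multiset of words.
    if len(a) != len(b) or Counter(a) != Counter(b):
        return False
    # Collect positions where the captions disagree.
    differing = [(x, y) for x, y in zip(a, b) if x != y]
    # There must be at least one difference, and every difference
    # must be one color word replaced by another color word.
    return bool(differing) and all(
        x in COLOR_WORDS and y in COLOR_WORDS for x, y in differing
    )
```

For instance, "A red apple on a green table." and "A green apple on a red table." pass this check, whereas two identical captions, or captions that swap non-color words, do not.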
Provider:
Voxel51



