colorswap

ModelScope community · Updated 2025-11-27 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/stanfordnlp/colorswap
Resource description:
# ColorSwap: A Color and Word Order Dataset for Multimodal Evaluation

## Dataset Description

ColorSwap is a dataset designed to assess and improve the proficiency of multimodal models in matching objects with their colors. The dataset comprises 2,000 unique image-caption pairs, grouped into 1,000 examples. Each example includes an image-caption pair, along with a "color-swapped" pair. Crucially, the two captions in an example have the same words, but the color words have been rearranged to modify different objects. The dataset was created through a novel blend of automated caption and image generation with humans in the loop.

Paper: Coming soon!

## Usage

You can download the dataset directly from the Hugging Face API with the following code:

```python
from datasets import load_dataset

dataset = load_dataset("stanfordnlp/colorswap", use_auth_token=True)
```

Please make sure to install the `datasets` library and use the `use_auth_token` parameter to authenticate with the Hugging Face API. Recent releases of `datasets` deprecate `use_auth_token` in favor of `token`; if the call above is rejected, passing `token=True` instead should work.

An example of the dataset is as follows:

```python
[
    {
        'id': 0,
        'image_1': <PIL.PngImagePlugin.PngImageFile image mode=RGB size=1024x1024 at 0x14D908B20>,
        'image_2': <PIL.PngImagePlugin.PngImageFile image mode=RGB size=1024x1024 at 0x14D9DCE20>,
        'caption_1': 'someone holding a yellow umbrella wearing a white dress',
        'caption_2': 'someone holding a white umbrella wearing a yellow dress',
        'image_source': 'midjourney',
        'caption_source': 'human'
    }
    ...
]
```

## Evaluations

[This Google Colab](https://colab.research.google.com/drive/1EWPsSklfq49WiX2nUyOTmKZftU0AC4YL?usp=sharing) showcases our ITM model evaluations. Please refer to our GitHub repository for the VLM evaluations: [ColorSwap](https://github.com/Top34051/colorswap).
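The defining property stated in the Dataset Description (both captions in an example use the same words, with only the color words rearranged) can be checked with a few lines of plain Python. The sketch below is illustrative only and is not part of the dataset's tooling: the helper names `same_words` and `differing_positions` are hypothetical, and it assumes captions can be tokenized by lowercasing and splitting on whitespace. It uses the example captions quoted above rather than downloading the dataset.

```python
from collections import Counter


def same_words(caption_1: str, caption_2: str) -> bool:
    """Return True if both captions contain exactly the same multiset of words.

    Assumption: whitespace tokenization on lowercased text is good enough.
    """
    return Counter(caption_1.lower().split()) == Counter(caption_2.lower().split())


def differing_positions(caption_1: str, caption_2: str) -> list[tuple[int, str, str]]:
    """List (index, word_in_caption_1, word_in_caption_2) where the captions differ.

    zip() stops at the shorter caption, so call same_words() first if the
    captions might have different lengths.
    """
    words_1 = caption_1.lower().split()
    words_2 = caption_2.lower().split()
    return [(i, a, b) for i, (a, b) in enumerate(zip(words_1, words_2)) if a != b]


# Captions copied from the example record shown in the Usage section.
example = {
    "caption_1": "someone holding a yellow umbrella wearing a white dress",
    "caption_2": "someone holding a white umbrella wearing a yellow dress",
}

print(same_words(example["caption_1"], example["caption_2"]))
# True
print(differing_positions(example["caption_1"], example["caption_2"]))
# [(3, 'yellow', 'white'), (7, 'white', 'yellow')]
```

The same two helpers could be applied to each record returned by `load_dataset`, reading the `caption_1` and `caption_2` fields, to confirm that a pair differs only in where its color words appear.
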
## Citation

If you find our work useful, please cite the following paper:

```
@article{burapacheep2024colorswap,
    author = {Jirayu Burapacheep and Ishan Gaur and Agam Bhatia and Tristan Thrush},
    title = {ColorSwap: A Color and Word Order Dataset for Multimodal Evaluation},
    journal = {arXiv},
    year = {2024},
}
```

Provider:
maas
Created:
2025-10-04