GCC-15M
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/microsoft/UniCL
Description:
GCC-15M combines GCC-3M and GCC-12M into a single collection of 15 million image-text pairs. It is used for joint image-text-label training in the proposed UniCL framework and targets image-text classification and retrieval tasks.
Provider:
GCC



