ABC-Pretraining-Data

Name: ABC-Pretraining-Data
Creator: maas
Published: 2025-12-05 16:24:59
License: No description available

Source: ModelScope community; updated 2025-12-05; indexed 2025-03-01

Download link:

https://modelscope.cn/datasets/TIGER-Lab/ABC-Pretraining-Data

Resource description:

## ABC Pretraining Data

This dataset contains the pretraining data for ABC, an open-source multimodal embedding model that uses a vision-language model backbone to deeply integrate image features with natural language instructions, advancing the state of visual embeddings with natural language control.

This dataset is derived from Google's [Conceptual Captions](https://ai.google.com/research/ConceptualCaptions/) dataset. Each item contains a URL from which the corresponding image can be downloaded, along with the mined negatives for that item. The full dataset is ~300 GB of images. For a detailed description of how we mined the negatives, please check out our paper.

**Update**: The images have been added to this repository. For an example of how to use and download this dataset, see our [repository](https://github.com/TIGER-AI-Lab/ABC).

## Paper, Project Page, and Code

- Paper: [ABC: Achieving Better Control of Multimodal Embeddings using VLMs](https://huggingface.co/papers/2503.00329)
- Project Page: [https://tiger-ai-lab.github.io/ABC/](https://tiger-ai-lab.github.io/ABC/)
- Code: [https://github.com/TIGER-AI-Lab/ABC](https://github.com/TIGER-AI-Lab/ABC)

## Sample Usage

### Quick Start

First, clone the repository and install its dependencies:

```bash
git clone https://github.com/TIGER-AI-Lab/ABC
cd ABC
pip install -r requirements.txt
```

Then, you can start making multimodal embeddings:

```bash
python -i ./quick_start.py
```

### Fetching Datasets from 🤗 Hub

Our datasets are hosted on HuggingFace Hub. The text data and dataset metadata can be fetched using HF's `load_dataset` utility. To fetch the images from our datasets, we provide scripts in the `fetch_datasets` directory. These scripts will pull the pretraining/finetuning image data off the hub and unpack them in your huggingface datasets cache (under a directory called `tigerlab`).
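
The two routes described above (metadata via `load_dataset`, images unpacked under `tigerlab` in the datasets cache) can be sketched as follows. This is a minimal illustration, not the repository's own code. The repo id is assumed to mirror the path in the ModelScope URL, the `train` split name is an assumption, and the helper names (`hf_datasets_cache`, `tigerlab_image_dir`, `peek_metadata`) are hypothetical. The cache lookup follows the environment variables documented by Hugging Face (`HF_DATASETS_CACHE`, `HF_HOME`, `XDG_CACHE_HOME`). The exact layout inside `tigerlab` is not specified here, so the sketch stops at that directory.

```python
import os
from itertools import islice
from pathlib import Path

# Assumption: the Hub repo id matches the path in the ModelScope URL.
DATASET_ID = "TIGER-Lab/ABC-Pretraining-Data"


def hf_datasets_cache() -> Path:
    """Resolve the Hugging Face datasets cache directory.

    Precedence: HF_DATASETS_CACHE, then HF_HOME/datasets, then
    $XDG_CACHE_HOME/huggingface/datasets, then ~/.cache/huggingface/datasets.
    """
    explicit = os.environ.get("HF_DATASETS_CACHE")
    if explicit:
        return Path(explicit).expanduser()
    hf_home = os.environ.get("HF_HOME")
    if hf_home:
        return Path(hf_home).expanduser() / "datasets"
    xdg = os.environ.get("XDG_CACHE_HOME")
    base = Path(xdg).expanduser() if xdg else Path.home() / ".cache"
    return base / "huggingface" / "datasets"


def tigerlab_image_dir() -> Path:
    """Directory where the fetch scripts are described as unpacking images."""
    # Assumption: `tigerlab` sits directly under the datasets cache root.
    return hf_datasets_cache() / "tigerlab"


def peek_metadata(n: int = 3) -> list:
    """Stream the first n metadata records without downloading the images."""
    # Imported lazily so the path helpers work without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the repo exposes a split named "train".
    stream = load_dataset(DATASET_ID, split="train", streaming=True)
    return list(islice(stream, n))


if __name__ == "__main__":
    print("Images expected under:", tigerlab_image_dir())
```

Calling `peek_metadata()` requires network access and the `datasets` package. Inspect `records[0].keys()` on the result to see the actual column names rather than assuming them.
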
To fetch the images, run `python ./fetch_datasets/pretrain.py` for the pretraining dataset or `python ./fetch_datasets/instruct.py` for the finetuning dataset.

## Citation

If you find any of our work helpful, please consider citing:

```bibtex
@misc{schneider2025abcachievingbettercontrol,
      title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs},
      author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen},
      year={2025},
      eprint={2503.00329},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.00329},
}
```

Provider: maas
Created: 2025-02-27
