ABC-Pretraining-Data
ModelScope community · Updated 2025-12-05 · Indexed 2025-03-01
Download link:
https://modelscope.cn/datasets/TIGER-Lab/ABC-Pretraining-Data
Description:
## ABC Pretraining Data
This dataset contains the pretraining data for ABC, an open-source multimodal embedding model that uses a vision-language model backbone to deeply integrate image features with natural language instructions, advancing the state of visual embeddings with natural language control.
This dataset is derived from Google's [Conceptual Captions](https://ai.google.com/research/ConceptualCaptions/) dataset.
Each item in the dataset contains a URL from which the corresponding image can be downloaded, along with the mined negatives for that item. The full dataset is ~300 GB of images. For a detailed description of how the negatives were mined, please see our paper.
**Update**: The images have been added to this repository. For an example of how to use and download this dataset, see our [repository](https://github.com/TIGER-AI-Lab/ABC).
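Since this card does not list the column names, one way to find the fields holding the image URL and the mined negatives is to stream a few records and inspect their keys. The following is a minimal sketch, not the authors' loader: the helper names `summarize_fields` and `peek_hub_dataset` are hypothetical, the Hub repo id is assumed to match the ModelScope path above, and the `train` split name is an assumption.

```python
from collections import defaultdict
from itertools import islice
from typing import Any, Dict, Iterable, Mapping, Set


def summarize_fields(
    records: Iterable[Mapping[str, Any]], limit: int = 5
) -> Dict[str, Set[str]]:
    """Map each field name to the Python type names seen in the first `limit` records."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for record in islice(records, limit):
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return dict(seen)


def peek_hub_dataset(
    repo_id: str = "TIGER-Lab/ABC-Pretraining-Data",  # assumed Hub id
    split: str = "train",  # assumed split name
    limit: int = 5,
) -> Dict[str, Set[str]]:
    """Stream a few records from the Hub (needs network and `pip install datasets`)."""
    # Imported lazily so summarize_fields stays usable without the library.
    from datasets import load_dataset

    # streaming=True iterates over records without downloading the full dataset.
    stream = load_dataset(repo_id, split=split, streaming=True)
    return summarize_fields(stream, limit)


if __name__ == "__main__":
    # Offline demo with made-up records. These field names are placeholders,
    # not the dataset's real column names.
    demo = [
        {"url": "https://example.com/a.jpg", "negatives": [3, 7]},
        {"url": "https://example.com/b.jpg", "negatives": []},
    ]
    print(summarize_fields(demo))
```

Calling `peek_hub_dataset()` on a machine with network access should report the actual field names, which can then be used in place of the placeholders.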
## Paper, Project Page, and Code
- Paper: [ABC: Achieving Better Control of Multimodal Embeddings using VLMs](https://huggingface.co/papers/2503.00329)
- Project Page: [https://tiger-ai-lab.github.io/ABC/](https://tiger-ai-lab.github.io/ABC/)
- Code: [https://github.com/TIGER-AI-Lab/ABC](https://github.com/TIGER-AI-Lab/ABC)
## Sample Usage
### Quick Start
First, install the necessary dependencies by cloning the repository and installing requirements:
```bash
git clone https://github.com/TIGER-AI-Lab/ABC
cd ABC
pip install -r requirements.txt
```
Then, you can start making multimodal embeddings:
```bash
python -i ./quick_start.py
```
### Fetching Datasets from 🤗 Hub
Our datasets are hosted on the Hugging Face Hub. The text data and dataset metadata can be fetched using the `load_dataset` utility from the `datasets` library.
To fetch the images from our datasets, we provide scripts in the `fetch_datasets` directory.
These scripts pull the pretraining/finetuning image data from the Hub and unpack it in your Hugging Face datasets cache (under a directory called `tigerlab`).
Run `python ./fetch_datasets/pretrain.py` to get the pretraining dataset and `python ./fetch_datasets/instruct.py` to get the finetuning dataset.
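Before running the fetch scripts, it may help to confirm where the `tigerlab` directory is likely to land and whether that disk has room for roughly 300 GB of images. The sketch below uses only the standard library. It follows the default cache resolution of the `datasets` library (`HF_DATASETS_CACHE`, then `HF_HOME`, then `~/.cache/huggingface/datasets`); the fetch scripts themselves may resolve the path differently. The helper names are hypothetical, and the 300 GB threshold is the card's approximate figure, so archives plus unpacked copies may need more.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def hf_datasets_cache(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the datasets cache directory from environment variables."""
    env = os.environ if env is None else env
    if env.get("HF_DATASETS_CACHE"):
        return Path(env["HF_DATASETS_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "datasets"
    cache_root = env.get("XDG_CACHE_HOME") or "~/.cache"
    return Path(cache_root).expanduser() / "huggingface" / "datasets"


def has_room(path: Path, required_gb: float = 300.0) -> bool:
    """Return True if the filesystem holding `path` has `required_gb` GB free."""
    probe = path
    # Walk up to the nearest existing directory, since the target
    # may not have been created yet.
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent
    free_bytes = shutil.disk_usage(probe).free
    return free_bytes >= required_gb * 1e9


if __name__ == "__main__":
    target = hf_datasets_cache() / "tigerlab"
    print(f"Expected image directory: {target}")
    print(f"At least 300 GB free: {has_room(target)}")
```

If the check reports too little space, setting `HF_DATASETS_CACHE` to a directory on a larger disk before running the scripts is one option, assuming the scripts honor that variable.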
## Citation
If you find any of our work helpful, please consider citing:
```bibtex
@misc{schneider2025abcachievingbettercontrol,
title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs},
author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen},
year={2025},
eprint={2503.00329},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.00329},
}
```
Provider:
maas
Created:
2025-02-27



