ahnpersie/coco-deceptive-clip-llama3.1-8b
Source: Hugging Face. Updated 2025-12-08; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/ahnpersie/coco-deceptive-clip-llama3.1-8b
Resource overview:
---
license: cc-by-nc-4.0
task_categories:
- text-generation
language:
- en
tags:
- LLM
- large language model
- adversarial captioning
- vision-language compositionality
size_categories:
- 100K<n<1M
---
# COCO-Deceptive-CLIP-LLaMA-3.1-8B Training Dataset
> 🏆 **This work has been accepted to ACL 2025 (Main Conference).**
<p align="left">
<img src="./main_result.png" alt="main result" width="60%" height="60%">
<em>Figure: Attack success rate (ASR) and caption diversity of our model on the COCO dataset, illustrating its ability to generate deceptive captions that successfully fool CLIP.</em>
</p>
## Dataset Description
- **Repository:** [Code](https://github.com/ahnjaewoo/MAC)
- **Paper:** [Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates](https://arxiv.org/abs/2505.22943)
- **Point of Contact:** [Jaewoo Ahn](mailto:jaewoo.ahn@vision.snu.ac.kr), [Heeseung Yun](mailto:heeseung.yun@vision.snu.ac.kr)
## Dataset Details
This dataset provides **instruction–response pairs** formatted as short two-turn conversations:
* The **user message** contains:
    * A given image caption.
    * A set of **task instructions** defining the deceptive caption generation rules.
    * The requirement to output a **Generated Caption:** that contradicts the original caption while remaining semantically close enough to fool CLIP.
* The **assistant message** contains:
    * A single line that begins with
      ```text
      Generated Caption: ...
      ```
      which serves as the synthesized "deceptive" (or "adversarial") caption.
Each dataset instance follows the structure:
```json
[
  {
    "content": "<deceptive caption generation instructions + given caption>",
    "role": "user"
  },
  {
    "content": "Generated Caption: <model-generated deceptive caption>",
    "role": "assistant"
  }
]
```
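As a minimal sketch of how a record in this structure could be consumed, the Python snippet below checks the two-turn shape and strips the `Generated Caption:` prefix from the assistant turn. The function name `extract_caption`, the placeholder instruction text, and the example captions are hypothetical and are not taken from the dataset.

```python
import json

PREFIX = "Generated Caption:"


def extract_caption(record):
    """Return (user_prompt, deceptive_caption) from one two-turn record."""
    if len(record) != 2:
        raise ValueError("expected exactly two turns (user, assistant)")
    user, assistant = record
    if user.get("role") != "user" or assistant.get("role") != "assistant":
        raise ValueError("expected a user turn followed by an assistant turn")
    reply = assistant["content"].strip()
    if not reply.startswith(PREFIX):
        raise ValueError("assistant turn does not start with the expected prefix")
    # Drop the fixed prefix and surrounding whitespace to get the caption itself.
    return user["content"], reply[len(PREFIX):].strip()


# Hypothetical record for illustration only; the real instruction text differs.
sample = json.loads("""
[
  {"content": "<instructions> Given Caption: A dog sits on a red couch.", "role": "user"},
  {"content": "Generated Caption: A dog sits under a red couch.", "role": "assistant"}
]
""")

prompt, caption = extract_caption(sample)
print(caption)
```

Validating the roles and the prefix up front makes malformed records fail loudly rather than silently producing empty captions during preprocessing.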
This conversational schema is optimized for fine-tuning instruction-following models to produce **deceptive captions** that increase CLIP similarity while contradicting the ground-truth semantics with minimal word-level edits.
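To make "minimal word-level edits" concrete, the following illustrative sketch counts word-level Levenshtein edits (insertions, deletions, substitutions) between an original caption and a deceptive one. It is a generic edit-distance computation and not necessarily the measure used by the authors; the function name `word_edit_distance` and the example captions are assumptions made for illustration.

```python
def word_edit_distance(original, candidate):
    """Word-level Levenshtein distance between two captions (case-insensitive)."""
    a = original.lower().split()
    b = candidate.lower().split()
    # prev[j] holds the distance between the first i-1 words of a and first j words of b.
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete a word from a
                            curr[j - 1] + 1,      # insert a word from b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


# Hypothetical caption pair: one substituted word changes the meaning.
original = "A dog sits on a red couch."
deceptive = "A dog sits under a red couch."
print(word_edit_distance(original, deceptive))  # 1
```

A small distance paired with a contradictory meaning is the kind of pair the schema above is meant to supervise; whether such a caption actually raises CLIP similarity has to be measured separately with a CLIP model.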
---
### Relation to the Fine-Tuned Model
This dataset was used to fine-tune
👉 **[ahnpersie/llama3.1-8b-lora-coco-deceptive-clip](https://huggingface.co/ahnpersie/llama3.1-8b-lora-coco-deceptive-clip)**,
a LoRA-adapted version of **LLaMA-3.1-8B** that learns to generate deceptive captions capable of misleading CLIP.
The released model demonstrates:
* effective adversarial caption generation,
* strong **attack success rate (ASR)** and **attack diversity (H)** on COCO,
* and improved compositional deception behavior originating from this dataset’s structured supervision.
---
## How to Use
See our GitHub [repository](https://github.com/ahnjaewoo/MAC) for full usage instructions and scripts.
## Citation
Please cite our work if you find the resources in this repository useful:
```bibtex
@inproceedings{ahn2025mac,
  title={Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates},
  author={Jaewoo Ahn and Heeseung Yun and Dayoon Ko and Gunhee Kim},
  booktitle={ACL},
  year={2025}
}
```
Provider:
ahnpersie



