
yilingwang/visual-contrastive-dpo-dataset

Source: Hugging Face. Updated 2026-03-23. Indexed 2026-03-29.
Download link: https://hf-mirror.com/datasets/yilingwang/visual-contrastive-dpo-dataset
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- image-text-to-text
- visual-question-answering
language:
- en
tags:
- multimodal
- dpo
- hallucination
- preference-learning
- vlm
size_categories:
- 1K<n<10K
---

# Visual Contrastive DPO Dataset

Hallucination-aligned visual negative construction dataset for multimodal DPO training.

## Dataset Summary

This dataset contains **8,456 preference pairs** with **7,971 generated negative images** for cross-modal DPO training on VLMs. Each negative image is generated to visually depict the hallucinated content from the model's rejected response.

## Data Construction Pipeline

1. **Self-rollout sampling**: Qwen2.5-VL-3B generates 8 candidate responses per (image, question) pair
2. **Judge scoring**: Qwen3-8B scores each response (0-10) against ground truth
3. **Pair selection**: Filter pairs with chosen score ≥ 6, rejected score ≤ 5, margin ≥ 4
4. **Semantic extraction**: Extract key differences between chosen/rejected responses
5. **Image generation**: Qwen-Image generates negative images matching rejected (hallucinated) content

## Files

- `rlhfv_ovip_dpo.jsonl` - Full dataset (8,456 pairs)
- `rlhfv_ovip_dpo_filtered.jsonl` - Filtered dataset (3,493 pairs, removed length-biased and reference-fallback pairs)
- `images/` - Original images from RLAIF-V and RLHF-V
- `generated_images/` - Generated negative images (384×384)

## Data Format

```json
{
  "image": "images/rlhf/1640.jpg",
  "question": "Is there only one person visible?",
  "chosen": "Yes, only one person is visible.",
  "rejected": "No, there is no person visible.",
  "id": "rlhf_1640",
  "edited_image_path": "generated_images/rlhf_1640_neg.png",
  "edit_original": "one person visible",
  "edit_modified": "no person visible",
  "edit_difference": "presence vs absence of person"
}
```

## Source Data

- RLAIF-V (openbmb/RLAIF-V-Dataset): ~10K sampled
- RLHF-V (openbmb/RLHF-V-Dataset): ~5.7K full

## Models Used

| Component | Model |
|-----------|-------|
| Sampling | Qwen2.5-VL-3B-Instruct |
| Judge | Qwen3-8B |
| Image Generation | Tongyi-MAI/Z-Image-Turbo |
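
The pair selection thresholds in the construction pipeline (chosen score ≥ 6, rejected score ≤ 5, margin ≥ 4) can be illustrated with a minimal Python sketch. The dataset card does not say how the chosen and rejected responses are picked among the 8 candidates, so this sketch assumes the highest-scoring and lowest-scoring candidates are paired. The function name `select_pair` and the `(response, score)` tuple layout are hypothetical and are not taken from the dataset's own code.

```python
from typing import Optional

# Thresholds as stated in the dataset card.
CHOSEN_MIN = 6      # chosen response must score at least this
REJECTED_MAX = 5    # rejected response must score at most this
MIN_MARGIN = 4      # chosen score minus rejected score must be at least this


def select_pair(candidates: list[tuple[str, int]]) -> Optional[dict]:
    """Return a preference pair from scored candidates, or None.

    `candidates` is a list of (response_text, judge_score) tuples.
    Assumption: the best-scoring candidate is used as "chosen" and the
    worst-scoring one as "rejected"; the original pipeline may differ.
    """
    if len(candidates) < 2:
        return None

    chosen = max(candidates, key=lambda c: c[1])
    rejected = min(candidates, key=lambda c: c[1])
    margin = chosen[1] - rejected[1]

    if chosen[1] >= CHOSEN_MIN and rejected[1] <= REJECTED_MAX and margin >= MIN_MARGIN:
        return {"chosen": chosen[0], "rejected": rejected[0], "margin": margin}
    return None


if __name__ == "__main__":
    scored = [
        ("Yes, only one person is visible.", 9),
        ("No, there is no person visible.", 2),
        ("There might be one person.", 5),
    ]
    print(select_pair(scored))
```

Under these assumptions, a question whose candidates all score similarly yields no pair, which is one plausible reason the number of pairs is smaller than the number of sampled source examples.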
Provider: yilingwang