pupubeauty/FragFake
Source: Hugging Face; updated 2026-04-06; indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/pupubeauty/FragFake
Resource description:
---
configs:
- config_name: Gemini-IG-easy
  data_files:
  - split: test
    path: VLM_Dataset/Gemini-IG/easy/Gemini-IG_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/Gemini-IG/easy/Gemini-IG_easy_train_conversation.json
- config_name: Gemini-IG-hard
  data_files:
  - split: test
    path: VLM_Dataset/Gemini-IG/hard/Gemini-IG_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/Gemini-IG/hard/Gemini-IG_hard_train_conversation.json
- config_name: GoT-easy
  data_files:
  - split: test
    path: VLM_Dataset/GoT/easy/GoT_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/GoT/easy/GoT_easy_train_conversation.json
- config_name: GoT-hard
  data_files:
  - split: test
    path: VLM_Dataset/GoT/hard/GoT_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/GoT/hard/GoT_hard_train_conversation.json
- config_name: MagicBrush-easy
  data_files:
  - split: test
    path: VLM_Dataset/MagicBrush/easy/MagicBrush_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/MagicBrush/easy/MagicBrush_easy_train_conversation.json
- config_name: MagicBrush-hard
  data_files:
  - split: test
    path: VLM_Dataset/MagicBrush/hard/MagicBrush_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/MagicBrush/hard/MagicBrush_hard_train_conversation.json
- config_name: UltraEdit-easy
  data_files:
  - split: test
    path: VLM_Dataset/UltraEdit/easy/UltraEdit_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/UltraEdit/easy/UltraEdit_easy_train_conversation.json
- config_name: UltraEdit-hard
  data_files:
  - split: test
    path: VLM_Dataset/UltraEdit/hard/UltraEdit_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/UltraEdit/hard/UltraEdit_hard_train_conversation.json
license: apache-2.0
task_categories:
- visual-question-answering
language:
- en
tags:
- Edited Image Detection
- VLM
pretty_name: FragFake
size_categories:
- 10K<n<100K
---
# FragFake: VLM-Based Edited-Image Detection Dataset
This repository contains four groups of examples (**Gemini-IG**, **GoT**, **MagicBrush**, and **UltraEdit**), each with two difficulty levels: **easy** and **hard**. The YAML front matter above tells the HF Dataset Viewer to expose eight configurations in the “Configurations” dropdown. Once you select a configuration, you’ll see the `train` and `test` splits declared for it in the front matter.
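The eight configuration names and their data files follow a regular pattern in the YAML front matter. The sketch below reconstructs them from that pattern; the helper names `config_names` and `data_file` are hypothetical and only illustrate the naming scheme, they are not part of the dataset or of any library.

```python
from itertools import product

# Groups, difficulty levels, and splits as listed in the YAML front matter
GROUPS = ["Gemini-IG", "GoT", "MagicBrush", "UltraEdit"]
LEVELS = ["easy", "hard"]
SPLITS = ["train", "test"]


def config_names():
    """Return the eight configuration names, e.g. 'GoT-hard' (hypothetical helper)."""
    return [f"{group}-{level}" for group, level in product(GROUPS, LEVELS)]


def data_file(group, level, split):
    """Build the JSON path that the front matter lists for one split (hypothetical helper)."""
    return f"VLM_Dataset/{group}/{level}/{group}_{level}_{split}_conversation.json"


for name in config_names():
    # Split on the last hyphen so that 'Gemini-IG-easy' yields ('Gemini-IG', 'easy')
    group, level = name.rsplit("-", 1)
    print(name, [data_file(group, level, split) for split in SPLITS])
```

Any name returned by `config_names()` can be passed as the `name` argument of `load_dataset`.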
## Sampling Policy for Edited Images
To prevent potential privacy or content leakage, only one edited version is retained per original image:
- In the source data, some original images have two edited versions (e.g., object addition and object replacement).
- We randomly select and retain only one of them in the test set. The other version and its corresponding conversation are discarded (they are not included in the training set either).
- As a result, each hard version may contain slightly fewer edited-image conversations, because not every original image has two valid modifications under the hard-version instructions.
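The selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' actual script: the record fields `original_id` and `edit_type`, the function name, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def keep_one_edit_per_original(records, seed=0):
    """Keep one randomly chosen edited version per original image.

    Each record is assumed to carry an 'original_id' field (hypothetical name)
    identifying the original image it was edited from.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_original = defaultdict(list)
    for record in records:
        by_original[record["original_id"]].append(record)
    # One record survives per original image; the others are discarded entirely
    return [rng.choice(versions) for versions in by_original.values()]


# Toy input: img_001 has two edited versions, img_002 has one
records = [
    {"original_id": "img_001", "edit_type": "addition"},
    {"original_id": "img_001", "edit_type": "replacement"},
    {"original_id": "img_002", "edit_type": "addition"},
]
kept = keep_one_edit_per_original(records)
print(len(kept))  # one record per original image
```

Grouping by original image before sampling guarantees that no original image contributes more than one edited version, which is the property the policy is after.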
## Usage Example
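```python
from datasets import load_dataset

# Load the UltraEdit-hard configuration; this returns a DatasetDict
# keyed by split name ("train" and "test", per the YAML front matter)
ds = load_dataset(
    "Vincent-HKUSTGZ/FragFake",
    name="UltraEdit-hard",
)

# Inspect the first record of the training split
print(ds["train"][0])
```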
Provider:
pupubeauty



