
pupubeauty/FragFake

Source: Hugging Face · Updated: 2026-04-06 · Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/pupubeauty/FragFake
Description:
---
configs:
- config_name: Gemini-IG-easy
  data_files:
  - split: test
    path: VLM_Dataset/Gemini-IG/easy/Gemini-IG_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/Gemini-IG/easy/Gemini-IG_easy_train_conversation.json
- config_name: Gemini-IG-hard
  data_files:
  - split: test
    path: VLM_Dataset/Gemini-IG/hard/Gemini-IG_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/Gemini-IG/hard/Gemini-IG_hard_train_conversation.json
- config_name: GoT-easy
  data_files:
  - split: test
    path: VLM_Dataset/GoT/easy/GoT_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/GoT/easy/GoT_easy_train_conversation.json
- config_name: GoT-hard
  data_files:
  - split: test
    path: VLM_Dataset/GoT/hard/GoT_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/GoT/hard/GoT_hard_train_conversation.json
- config_name: MagicBrush-easy
  data_files:
  - split: test
    path: VLM_Dataset/MagicBrush/easy/MagicBrush_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/MagicBrush/easy/MagicBrush_easy_train_conversation.json
- config_name: MagicBrush-hard
  data_files:
  - split: test
    path: VLM_Dataset/MagicBrush/hard/MagicBrush_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/MagicBrush/hard/MagicBrush_hard_train_conversation.json
- config_name: UltraEdit-easy
  data_files:
  - split: test
    path: VLM_Dataset/UltraEdit/easy/UltraEdit_easy_test_conversation.json
  - split: train
    path: VLM_Dataset/UltraEdit/easy/UltraEdit_easy_train_conversation.json
- config_name: UltraEdit-hard
  data_files:
  - split: test
    path: VLM_Dataset/UltraEdit/hard/UltraEdit_hard_test_conversation.json
  - split: train
    path: VLM_Dataset/UltraEdit/hard/UltraEdit_hard_train_conversation.json
license: apache-2.0
task_categories:
- visual-question-answering
language:
- en
tags:
- Edited Image Detection
- VLM
pretty_name: FragFake
size_categories:
- 10K<n<100K
---

# FragFake: VLM-Based Edited-Image Detection Dataset

This repository contains four groups of examples (**Gemini-IG**, **GoT**, **MagicBrush**, and **UltraEdit**), each with two difficulty levels: **easy** and **hard**. The YAML front matter above tells the HF Dataset Viewer to expose eight configurations in the "Configurations" dropdown. Once you select a configuration, you'll see its `train` and `test` splits, as declared in the front matter.

## Sampling Policy for Edited Images

To prevent potential privacy or content leakage, only one edited version is retained per original image:

- In the source data, some original images have two edited versions (e.g., object addition and object replacement).
- We randomly select one of them and retain it in the test set. The other version and its corresponding conversation are discarded (not included in the training set).
- As a result, each hard version may contain slightly fewer edited-image conversations, because not every original image has two valid modifications in the hard-version instructions.

## Usage Example

```python
from datasets import load_dataset

# Load the UltraEdit-hard configuration. Without a `split=` argument,
# load_dataset returns a DatasetDict keyed by split name.
ds = load_dataset(
    "Vincent-HKUSTGZ/FragFake",
    name="UltraEdit-hard"
)

# Inspect the first record of the test split
print(ds["test"][0])
```
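The eight configuration names and their data file paths in the front matter follow a regular pattern: `<group>-<difficulty>` for the name and `VLM_Dataset/<group>/<difficulty>/<group>_<difficulty>_<split>_conversation.json` for each path. The following is a minimal sketch that rebuilds that mapping, assuming the pattern holds exactly as listed in the front matter. The helper names `config_name` and `data_files` are illustrative only and are not part of the dataset or of the `datasets` library.

```python
from itertools import product

# Groups, difficulty levels, and splits as listed in the YAML front matter.
GROUPS = ["Gemini-IG", "GoT", "MagicBrush", "UltraEdit"]
DIFFICULTIES = ["easy", "hard"]
SPLITS = ["train", "test"]


def config_name(group: str, difficulty: str) -> str:
    """Build a configuration name such as 'UltraEdit-hard' (illustrative helper)."""
    return f"{group}-{difficulty}"


def data_files(group: str, difficulty: str) -> dict:
    """Map each split to its JSON path, following the front-matter pattern."""
    return {
        split: (
            f"VLM_Dataset/{group}/{difficulty}/"
            f"{group}_{difficulty}_{split}_conversation.json"
        )
        for split in SPLITS
    }


# One entry per configuration: 4 groups x 2 difficulty levels.
ALL_CONFIGS = {
    config_name(group, difficulty): data_files(group, difficulty)
    for group, difficulty in product(GROUPS, DIFFICULTIES)
}

if __name__ == "__main__":
    for name, files in ALL_CONFIGS.items():
        print(name, files)
```

Each key of `ALL_CONFIGS` is a value that can be passed as the `name` argument of `load_dataset`, as in the usage example.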

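The sampling policy (keep one randomly chosen edited version per original image and discard the rest) can be sketched as follows. This is an illustration under stated assumptions, not the authors' actual pipeline: the record fields `original_id` and `edit_type`, the function name `keep_one_edit_per_original`, and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict


def keep_one_edit_per_original(records, seed=0):
    """Retain one randomly chosen edited version per original image.

    `records` is a list of dicts with a hypothetical `original_id` key.
    Returns (retained, discarded).
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible

    # Group edited versions by the original image they were derived from.
    by_original = defaultdict(list)
    for record in records:
        by_original[record["original_id"]].append(record)

    retained, discarded = [], []
    for versions in by_original.values():
        chosen = rng.choice(versions)
        retained.append(chosen)
        # Every other version of the same original is dropped entirely.
        discarded.extend(v for v in versions if v is not chosen)
    return retained, discarded


if __name__ == "__main__":
    demo = [
        {"original_id": "img_001", "edit_type": "object addition"},
        {"original_id": "img_001", "edit_type": "object replacement"},
        {"original_id": "img_002", "edit_type": "object addition"},
    ]
    kept, dropped = keep_one_edit_per_original(demo)
    print(len(kept), "retained;", len(dropped), "discarded")
```

Originals with a single edited version pass through unchanged, so only images with two versions contribute to the discarded set.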
Provider:
pupubeauty