TheKernel01/AIGC-Detection-Benchmark
Source: Hugging Face. Updated 2026-04-03. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/TheKernel01/AIGC-Detection-Benchmark
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': real
          '1': fake
  - name: generator
    dtype:
      class_label:
        names:
          '0': Real
          '1': ADM
          '2': BigGAN
          '3': CycleGAN
          '4': DALLE2
          '5': GauGAN
          '6': GLIDE
          '7': Midjourney
          '8': ProGAN
          '9': SD14
          '10': SD15
          '11': SDXL
          '12': StarGAN
          '13': StyleGAN
          '14': StyleGAN2
          '15': VQDM
          '16': WhichFaceIsReal
          '17': Wukong
  splits:
  - name: test
    num_bytes: 29870625563
    num_examples: 125026
  download_size: 32032878953
  dataset_size: 29870625563
license: apache-2.0
task_categories:
- image-classification
language:
- en
---
# AIGC Detection Benchmark Dataset
## 📝 Dataset Description
**Dataset Summary**
The AIGC Detection Benchmark Dataset is a collection of images and associated labels for benchmarking models that detect AI-generated images and identify the model that produced them. It mixes real images with images generated by a range of AI models, including diffusion models (such as Stable Diffusion, DALL-E 2, Midjourney, and ADM) and GANs (such as BigGAN, StyleGAN, and ProGAN).
Each image carries two labels, which support two computer vision tasks: binary real/fake classification and multi-class source model identification. **Note: This version of the dataset is intended exclusively for testing and evaluation; all data is consolidated into a single test split.**
**Supported Tasks and Leaderboards**
This dataset supports two image classification tasks:
|**Task ID**|**Task Name**|**Description**| **Output Classes** |
| ----------- | ------------------------------ | ------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
|**Task A**|Binary Veracity Classification|Classifying images as either real or fake.| 2 (real, fake) |
|**Task B**|AI Model Source Identification|Identifying the specific AI generation model used for images labeled as AI-Generated.| 18 (Real, ADM, BigGAN, CycleGAN, DALLE2, GauGAN, GLIDE, Midjourney, ProGAN, SD14, SD15, SDXL, StarGAN, StyleGAN, StyleGAN2, VQDM, WhichFaceIsReal, Wukong) |
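A minimal sketch of how predictions for the two tasks might be scored, assuming plain overall accuracy plus a per-generator breakdown. The card does not prescribe a metric, so the function names and the choice of accuracy are illustrative assumptions, and the toy values are made up to show the call shape.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of correct predictions; works for Task A or Task B labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no examples to score")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_by_generator(y_true, y_pred, generators):
    """Accuracy of the real/fake prediction, grouped by each image's generator id."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, generators):
        totals[g] += 1
        hits[g] += int(t == p)
    return {g: hits[g] / totals[g] for g in totals}


# Toy, made-up values purely to show the call shape (not benchmark results).
labels      = [0, 1, 1, 1, 0, 1]   # Task A ground truth: 0 = real, 1 = fake
predictions = [0, 1, 0, 1, 1, 1]   # hypothetical detector output
generators  = [0, 3, 3, 9, 0, 9]   # 0 = Real, 3 = CycleGAN, 9 = SD14

overall = accuracy(labels, predictions)
per_generator = accuracy_by_generator(labels, predictions, generators)
print(f"overall accuracy: {overall:.3f}")
for gen_id, acc in sorted(per_generator.items()):
    print(f"generator {gen_id}: {acc:.3f}")
```

Grouping by generator shows whether a detector that scores well overall still fails on one family of images, which a single aggregate number would hide.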
**Languages**
The dataset card and label names are in English (en). The listed features (`image`, `label`, `generator`) do not include a caption or other text field.
## 🗂️ Data Splits
All instances have been merged into a single test split to serve strictly as an evaluation benchmark.
|**Split**|**Number of Instances**|**Notes**|
|---|---|---|
|**test**|125,026|Used exclusively for final, unbiased model evaluation and benchmarking.|
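As a rough sketch of what the single split implies in practice, the snippet below converts the sizes from the dataset metadata into more readable units and defines a loader. The repository id comes from the download link. The helper name `load_test_split` and the choice to stream rather than download everything up front are assumptions for illustration.

```python
REPO_ID = "TheKernel01/AIGC-Detection-Benchmark"  # taken from the download link

# Figures copied from the dataset metadata.
NUM_EXAMPLES = 125_026
DATASET_BYTES = 29_870_625_563
DOWNLOAD_BYTES = 32_032_878_953

GIB = 1024 ** 3
download_gib = DOWNLOAD_BYTES / GIB
avg_kib_per_image = DATASET_BYTES / NUM_EXAMPLES / 1024
print(f"download: {download_gib:.1f} GiB, average image: {avg_kib_per_image:.0f} KiB")


def load_test_split(streaming=True):
    """Return the `test` split; with streaming=True rows are fetched lazily."""
    # Imported lazily so the arithmetic above runs without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="test", streaming=streaming)


# Example use (requires network access and the `datasets` package):
# first = next(iter(load_test_split()))
# print(first["label"], first["generator"], first["image"].size)
```

Streaming is only one option. Passing `streaming=False` downloads and caches the full split, which may be preferable for repeated evaluation runs.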
## 💾 Dataset Structure
**Data Instances**
A single data instance consists of an image file and two distinct labels detailing its source and authenticity.
|**Field Name**|**Example Value**|**Description**|
|---|---|---|
|**image**|`<PIL.Image.Image object>`|The actual image content loaded into a PIL object.|
|**label**|`1`|Binary label for authenticity (`0` = real, `1` = fake).|
|**generator**|`3`|Multi-class label for the specific generation model, or Real (`3` = CycleGAN).|
**Data Fields**
The dataset contains the following fields:
|**Field Name**|**Data Type**|**Description**|
|---|---|---|
|**image**|`datasets.Image()`|The actual image content (e.g., .jpg, .png).|
|**label**|`datasets.ClassLabel`|Task A: Binary label for image veracity.|
|**generator**|`datasets.ClassLabel`|Task B: Label specifying the generation source/model.|
## 🏷️ Label Definitions
The two label fields use the following mappings:
**`label` (Binary Veracity Classification)**
|**Label**|**Value**|**Description**|
|---|---|---|
|**real**|`0`|Image is a real photograph or otherwise not AI-generated.|
|**fake**|`1`|Image was created by an AI generation model.|
**`generator` (Model Source Identification)**
| **Label** | **Value** | **Description** |
| ------------------- | --------- | --------------------------------------------------------- |
| **Real** | `0` | Real image (no AI generation involved). |
| **ADM** | `1` | Generated by Ablated Diffusion Model (Guided Diffusion). |
| **BigGAN** | `2` | Generated by BigGAN. |
| **CycleGAN** | `3` | Generated by CycleGAN. |
| **DALLE2** | `4` | Generated by OpenAI's DALL-E 2. |
| **GauGAN** | `5` | Generated by GauGAN (SPADE). |
| **GLIDE** | `6` | Generated by GLIDE. |
| **Midjourney** | `7` | Generated by Midjourney. |
| **ProGAN** | `8` | Generated by ProGAN (Progressive GAN). |
| **SD14** | `9` | Generated by Stable Diffusion 1.4. |
| **SD15** | `10` | Generated by Stable Diffusion 1.5. |
| **SDXL** | `11` | Generated by Stable Diffusion XL. |
| **StarGAN** | `12` | Generated by StarGAN. |
| **StyleGAN** | `13` | Generated by StyleGAN. |
| **StyleGAN2** | `14` | Generated by StyleGAN2. |
| **VQDM** | `15` | Generated by Vector Quantized Diffusion Model. |
| **WhichFaceIsReal** | `16` | Real human face sourced from the WhichFaceIsReal dataset. |
| **Wukong** | `17` | Generated by the Wukong diffusion model. |
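The mappings above can be mirrored in plain Python to decode integer labels or to tally rows per generator. This is a small sketch: the helper names `decode` and `count_by_generator` are hypothetical, and the sample rows are invented and omit the `image` field.

```python
from collections import Counter

# Index in each list equals the integer value in the tables above.
LABEL_NAMES = ["real", "fake"]
GENERATOR_NAMES = [
    "Real", "ADM", "BigGAN", "CycleGAN", "DALLE2", "GauGAN", "GLIDE",
    "Midjourney", "ProGAN", "SD14", "SD15", "SDXL", "StarGAN", "StyleGAN",
    "StyleGAN2", "VQDM", "WhichFaceIsReal", "Wukong",
]
GENERATOR_IDS = {name: idx for idx, name in enumerate(GENERATOR_NAMES)}


def decode(example):
    """Return (label_name, generator_name) for a row with integer labels."""
    return LABEL_NAMES[example["label"]], GENERATOR_NAMES[example["generator"]]


def count_by_generator(examples):
    """Count rows per generator name, e.g. to inspect class balance."""
    return Counter(GENERATOR_NAMES[ex["generator"]] for ex in examples)


# Hypothetical rows with the image field omitted.
rows = [
    {"label": 1, "generator": 3},
    {"label": 0, "generator": 0},
    {"label": 1, "generator": 3},
]
print(decode(rows[0]))
print(count_by_generator(rows))
```

When the dataset is loaded through the `datasets` library, the `ClassLabel` features should expose the same names via `int2str` and `str2int`, so a hand-written table like this is mainly useful outside that library.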
## 🔗 Sources
- **Original dataset**: [Ekko-zn/AIGCDetectBenchmark](https://github.com/Ekko-zn/AIGCDetectBenchmark).
Provider:
TheKernel01



