TheKernel01/AIGIBench
Source: Hugging Face. Updated 2026-04-03; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/TheKernel01/AIGIBench
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': real
          '1': fake
  - name: generator
    dtype:
      class_label:
        names:
          '0': Real
          '1': ProGAN
          '2': SD14
  splits:
  - name: train
    num_bytes: 53948577285
    num_examples: 288000
  - name: validation
    num_bytes: 4171349460
    num_examples: 20000
  download_size: 60520430840
  dataset_size: 58119926745
license: cc-by-nc-sa-4.0
task_categories:
- image-classification
language:
- en
pretty_name: p
---
# AIGIBench Dataset
## 📝 Dataset Description
### Dataset Summary
**AIGIBench** is an image collection for benchmarking detectors of artificial intelligence generated images (AIGIs). It is based on the research paper _"Is Artificial Intelligence Generated Image Detection a Solved Problem?"_ (NeurIPS 2025) and supports two tasks: binary real/fake classification and attribution of an image to its source generator.
The dataset contains 288,000 training samples and 20,000 validation samples, pairing real photographs with images generated by two models: **ProGAN** and **Stable Diffusion 1.4 (SD14)**.
### Supported Tasks
|**Task ID**|**Task Name**|**Description**|**Output Classes**|
|---|---|---|---|
|**Task A**|Binary Veracity Classification|Classifying images as either real or fake (AI-generated).|2 (real, fake)|
|**Task B**|AI Model Source Identification|Identifying the specific origin of the image (Real vs. ProGAN vs. SD14).|3 (Real, ProGAN, SD14)|
### Languages
The descriptive text, labels, and metadata are provided in **English (en)**.
### Data Splits
| **Split** | **Number of Instances** | **Notes** |
| -------------- | ----------------------- | -------------------------------------------------- |
| **train** | 288,000 | Used for model training and feature extraction. |
| **validation** | 20,000 | Used for hyperparameter tuning and early stopping. |
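As a quick sanity check on the split sizes, the figures from the card's metadata block can be turned into per-split shares and a rough size per example. This is only a sketch: the `SPLITS` dictionary and the `split_summary` helper are illustrative names, and `num_bytes` is taken at face value as the size the card reports, which may differ from the on-disk download size.

```python
# Figures copied from the dataset card's metadata block.
SPLITS = {
    "train": {"num_examples": 288_000, "num_bytes": 53_948_577_285},
    "validation": {"num_examples": 20_000, "num_bytes": 4_171_349_460},
}


def split_summary(splits):
    """Return each split's share of all examples and its mean reported bytes per example."""
    total_examples = sum(s["num_examples"] for s in splits.values())
    return {
        name: {
            "share": s["num_examples"] / total_examples,
            "mean_bytes": s["num_bytes"] / s["num_examples"],
        }
        for name, s in splits.items()
    }


summary = split_summary(SPLITS)
for name, stats in summary.items():
    print(
        f"{name}: {stats['share']:.1%} of examples, "
        f"about {stats['mean_bytes'] / 1024:.0f} KiB per example"
    )
```

The two `num_bytes` values add up to the `dataset_size` reported in the metadata (58,119,926,745 bytes), so the split table and the metadata agree.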
## 💾 Dataset Structure
### Data Instances
A single data instance consists of an image and two categorical labels identifying its authenticity and its specific generative source.
| **Field Name** | **Example Value** | **Description** |
| -------------- | -------------------------- | ---------------------------------------------------------- |
| `image` | `<PIL.Image.Image object>` | The actual image content loaded into a PIL object. |
| `label` | `1` | Binary label for authenticity (Real vs. Fake). |
| `generator` | `2` | Label specifying the generation source model (e.g., SD14). |
### Data Fields
| **Field Name** | **Data Type** | **Description** |
| -------------- | --------------------- | --------------------------------------------------------- |
| `image` | `datasets.Image()` | The actual image content. |
| `label` | `datasets.ClassLabel` | **Task A:** Binary label for image veracity. |
| `generator` | `datasets.ClassLabel` | **Task B:** Label specifying the generation source/model. |
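The fields above could be read with the Hugging Face `datasets` library roughly as follows. This is a minimal sketch, not an official loader: `decode_example` and `stream_validation` are hypothetical helper names, streaming mode is chosen only to avoid downloading the full archive up front, and the code assumes the feature metadata declared in the card resolves when the dataset is opened.

```python
def decode_example(example, label_names, generator_names):
    """Map the integer class ids of one record to their class names."""
    return {
        "label": label_names[example["label"]],
        "generator": generator_names[example["generator"]],
    }


def stream_validation(repo_id="TheKernel01/AIGIBench"):
    """Yield (image, decoded labels) pairs from the validation split.

    Needs the `datasets` package and network access. The import is done
    lazily so that decode_example stays usable without the package.
    """
    from datasets import load_dataset

    ds = load_dataset(repo_id, split="validation", streaming=True)
    # ClassLabel features expose their class names as a list.
    label_names = ds.features["label"].names
    generator_names = ds.features["generator"].names
    for example in ds:
        yield example["image"], decode_example(example, label_names, generator_names)
```

Calling `next(stream_validation())` would then return the first image together with a dictionary such as `{"label": "fake", "generator": "SD14"}`, depending on which record comes first.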
### Label Definitions
#### label (Binary Veracity Classification)
|**Label**|**Value**|**Description**|
|---|---|---|
|**real**|0|Image is an authentic photograph.|
|**fake**|1|Image was generated by an AI model.|
#### generator (Model Source Identification)
|**Label**|**Value**|**Description**|
|---|---|---|
|**Real**|0|Authentic photograph.|
|**ProGAN**|1|Generated using Progressive Growing of GANs.|
|**SD14**|2|Generated using Stable Diffusion version 1.4.|
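Read together, the two label tables suggest that the binary label can be derived from the generator label: generator `0` (Real) corresponds to label `0` (real), and any other generator corresponds to label `1` (fake). The card does not state this invariant explicitly, so the sketch below treats it as an assumption; the function names are illustrative.

```python
# Class names in id order, as listed in the label tables.
LABEL_NAMES = ["real", "fake"]
GENERATOR_NAMES = ["Real", "ProGAN", "SD14"]


def binary_label_from_generator(generator_id):
    """Collapse a Task B generator id into a Task A label id.

    Assumption: only generator 0 (Real) is an authentic photograph.
    """
    if not 0 <= generator_id < len(GENERATOR_NAMES):
        raise ValueError(f"unknown generator id: {generator_id}")
    return 0 if generator_id == 0 else 1


def is_consistent(label_id, generator_id):
    """Check that a record's two labels agree under the assumption above."""
    return label_id == binary_label_from_generator(generator_id)


for gen_id, gen_name in enumerate(GENERATOR_NAMES):
    label_name = LABEL_NAMES[binary_label_from_generator(gen_id)]
    print(f"{gen_name} -> {label_name}")
```

A check like `is_consistent` can be run over a sample of records to confirm the assumption before a Task B model's predictions are reused for Task A scoring.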
## 🔗 Sources
- **Original Dataset:** [HorizonTEL/AIGIBench](https://huggingface.co/datasets/HorizonTEL/AIGIBench)
- **Research Paper:** _Is Artificial Intelligence Generated Image Detection a Solved Problem?_ (NeurIPS 2025).
- **License:** Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0).
Provided by:
TheKernel01



