TheKernel01/AIGIBench

Name: TheKernel01/AIGIBench
Creator: TheKernel01
Published: 2026-04-03 10:57:47
License: CC BY-NC-SA 4.0 (as stated in the dataset card; the listing itself gives no description)

Source: Hugging Face. Updated: 2026-04-03. Indexed: 2026-04-12.

Download link:

https://hf-mirror.com/datasets/TheKernel01/AIGIBench


Resource description:

---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': real
          '1': fake
  - name: generator
    dtype:
      class_label:
        names:
          '0': Real
          '1': ProGAN
          '2': SD14
  splits:
  - name: train
    num_bytes: 53948577285
    num_examples: 288000
  - name: validation
    num_bytes: 4171349460
    num_examples: 20000
  download_size: 60520430840
  dataset_size: 58119926745
license: cc-by-nc-sa-4.0
task_categories:
- image-classification
language:
- en
pretty_name: p
---

# AIGIBench Dataset

## 📝 Dataset Description

### Dataset Summary

**AIGIBench** is a comprehensive image collection designed to benchmark the effectiveness of detection algorithms against artificial intelligence generated images (AIGIs). Based on the research paper _"Is Artificial Intelligence Generated Image Detection a Solved Problem?"_ (NeurIPS 2025), this dataset provides a rigorous testing ground for binary veracity classification and multi-model source attribution.

The dataset includes 288,000 training samples and 20,000 validation samples, featuring high-quality real photographs contrasted against images generated by prominent architectures like **ProGAN** and **Stable Diffusion 1.4 (SD14)**.

### Supported Tasks

| **Task ID** | **Task Name** | **Description** | **Output Classes** |
|---|---|---|---|
| **Task A** | Binary Veracity Classification | Classifying images as either real or fake (AI-generated). | 2 (real, fake) |
| **Task B** | AI Model Source Identification | Identifying the specific origin of the image (Real vs. ProGAN vs. SD14). | 3 (Real, ProGAN, SD14) |

### Languages

The descriptive text, labels, and metadata are provided in **English (en)**.

### Data Splits

| **Split** | **Number of Instances** | **Notes** |
|---|---|---|
| **train** | 288,000 | Used for model training and feature extraction. |
| **validation** | 20,000 | Used for hyperparameter tuning and early stopping. |

## 💾 Dataset Structure

### Data Instances

A single data instance consists of an image and two categorical labels identifying its authenticity and its specific generative source.

| **Field Name** | **Example Value** | **Description** |
|---|---|---|
| `image` | `<PIL.Image.Image object>` | The actual image content loaded into a PIL object. |
| `label` | `1` | Binary label for authenticity (Real vs. Fake). |
| `generator` | `2` | Label specifying the generation source model (e.g., SD14). |

### Data Fields

| **Field Name** | **Data Type** | **Description** |
|---|---|---|
| `image` | `datasets.Image()` | The actual image content. |
| `label` | `datasets.ClassLabel` | **Task A:** Binary label for image veracity. |
| `generator` | `datasets.ClassLabel` | **Task B:** Label specifying the generation source/model. |

### Label Definitions

#### label (Binary Veracity Classification)

| **Label** | **Value** | **Description** |
|---|---|---|
| **real** | 0 | Image is an authentic photograph. |
| **fake** | 1 | Image was generated by an AI model. |

#### generator (Model Source Identification)

| **Label** | **Value** | **Description** |
|---|---|---|
| **Real** | 0 | Authentic photograph. |
| **ProGAN** | 1 | Generated using Progressive Growing of GANs. |
| **SD14** | 2 | Generated using Stable Diffusion version 1.4. |

## 🔗 Sources

- **Original Dataset:** [HorizonTEL/AIGIBench](https://huggingface.co/datasets/HorizonTEL/AIGIBench)
- **Research Paper:** _Is Artificial Intelligence Generated Image Detection a Solved Problem?_ (NeurIPS 2025).
- **License:** Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0).
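
## Example Usage

The fields, label definitions, and split sizes described in this card can be exercised with a short Python sketch. It is illustrative only and not part of the original card. The helper names (`decode_example`, `is_consistent`, `total_examples`, `total_bytes`, `stream_first_examples`) are hypothetical, and the rule that `label == 0` always pairs with `generator == 0` is an assumption inferred from the label tables, not something the card states. The repository id is the one shown in this listing. Streaming is used so that a quick look does not require the full download of roughly 60 GB.

```python
from typing import Dict, Iterator

# Class names copied from the `dataset_info` block of the card.
LABEL_NAMES = ["real", "fake"]
GENERATOR_NAMES = ["Real", "ProGAN", "SD14"]

# Split statistics copied from the card's `splits` entries.
SPLITS = {
    "train": {"num_examples": 288_000, "num_bytes": 53_948_577_285},
    "validation": {"num_examples": 20_000, "num_bytes": 4_171_349_460},
}


def decode_example(label: int, generator: int) -> Dict[str, str]:
    """Map the integer class ids of one row to the names listed in the card."""
    return {
        "label": LABEL_NAMES[label],
        "generator": GENERATOR_NAMES[generator],
    }


def is_consistent(label: int, generator: int) -> bool:
    """Check that the two labels agree.

    Assumption (not stated in the card): 'real' (0) pairs only with the
    'Real' generator (0), and 'fake' (1) pairs with ProGAN or SD14.
    """
    return (label == 0) == (generator == 0)


def total_examples() -> int:
    """Sum the example counts of all splits."""
    return sum(split["num_examples"] for split in SPLITS.values())


def total_bytes() -> int:
    """Sum the byte sizes of all splits (the card's `dataset_size`)."""
    return sum(split["num_bytes"] for split in SPLITS.values())


def stream_first_examples(n: int = 3) -> Iterator[Dict[str, str]]:
    """Stream a few validation rows and decode their labels.

    Requires network access and `pip install datasets`; the import is
    kept local so the helpers above work without the library installed.
    """
    from datasets import load_dataset

    ds = load_dataset("TheKernel01/AIGIBench", split="validation", streaming=True)
    for example in ds.take(n):
        yield decode_example(example["label"], example["generator"])
```

Calling `decode_example(1, 2)` yields the names `fake` and `SD14`, matching the example row in the Data Instances table. `stream_first_examples()` is only a convenience for inspecting a handful of rows; for training, loading the `train` split in the usual non-streaming way may be more appropriate if disk space allows.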

Provided by:

TheKernel01
