Name: TheKernel01/AIGC-Detection-Benchmark
Creator: TheKernel01
Published: 2026-04-03 12:20:38
License: No description available

Download link:

https://hf-mirror.com/datasets/TheKernel01/AIGC-Detection-Benchmark

Resource overview:

---
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': real
          '1': fake
  - name: generator
    dtype:
      class_label:
        names:
          '0': Real
          '1': ADM
          '2': BigGAN
          '3': CycleGAN
          '4': DALLE2
          '5': GauGAN
          '6': GLIDE
          '7': Midjourney
          '8': ProGAN
          '9': SD14
          '10': SD15
          '11': SDXL
          '12': StarGAN
          '13': StyleGAN
          '14': StyleGAN2
          '15': VQDM
          '16': WhichFaceIsReal
          '17': Wukong
  splits:
  - name: test
    num_bytes: 29870625563
    num_examples: 125026
  download_size: 32032878953
  dataset_size: 29870625563
license: apache-2.0
task_categories:
- image-classification
language:
- en
---

# AIGC Detection Benchmark Dataset

## 📝 Dataset Description

**Dataset Summary**

The AIGC Detection Benchmark Dataset is a high-quality collection of images and associated metadata designed to benchmark models for detecting and identifying the source of artificially generated content. The dataset contains a mix of real-world images and images generated by a wide array of prominent AI models, including diffusion models (like Stable Diffusion, DALL-E 2, Midjourney, ADM) and GANs (like BigGAN, StyleGAN, ProGAN).

Each image is meticulously labeled under two categories, enabling researchers to tackle two distinct, high-value computer vision tasks: binary real/fake classification and multi-class source model identification.

**Note: This specific version of the dataset is designed exclusively for testing and evaluation purposes, with all data consolidated into a single test split.**

**Supported Tasks and Leaderboards**

This dataset directly supports two critical image classification tasks:

|**Task ID**|**Task Name**|**Description**|**Output Classes**|
|---|---|---|---|
|**Task A**|Binary Veracity Classification|Classifying images as either real or fake.|2 (real, fake)|
|**Task B**|AI Model Source Identification|Identifying the specific AI generation model used for images labeled as AI-Generated.|18 (Real, ADM, BigGAN, CycleGAN, DALLE2, GauGAN, GLIDE, Midjourney, ProGAN, SD14, SD15, SDXL, StarGAN, StyleGAN, StyleGAN2, VQDM, WhichFaceIsReal, Wukong)|

**Languages**

The descriptive text, including all captions, is in English (en).

## 🗂️ Data Splits

All instances have been merged into a single test split to serve strictly as an evaluation benchmark.

|**Split**|**Number of Instances**|**Notes**|
|---|---|---|
|**test**|125,026|Used exclusively for final, unbiased model evaluation and benchmarking.|

## 💾 Dataset Structure

**Data Instances**

A single data instance consists of an image file and two distinct labels detailing its source and authenticity.

|**Field Name**|**Example Value**|**Description**|
|---|---|---|
|**image**|`<PIL.Image.Image object>`|The actual image content loaded into a PIL object.|
|**label**|`1`|Binary label for authenticity (Real vs. AI-Generated).|
|**generator**|`3`|Multi-class label for the specific generation model (or Real).|

**Data Fields**

The dataset contains the following fields:

|**Field Name**|**Data Type**|**Description**|
|---|---|---|
|**image**|`datasets.Image()`|The actual image content (e.g., .jpg, .png).|
|**label**|`datasets.ClassLabel`|Task A: Binary label for image veracity.|
|**generator**|`datasets.ClassLabel`|Task B: Label specifying the generation source/model.|

## 🏷️ Label Definitions

The two label fields use the following mappings:

**`label` (Binary Veracity Classification)**

|**Label**|**Value**|**Description**|
|---|---|---|
|**real**|`0`|Image is a real photograph/non-AI generated.|
|**fake**|`1`|Image was created by an AI generation model.|

**`generator` (Model Source Identification)**

|**Label**|**Value**|**Description**|
|---|---|---|
|**Real**|`0`|Real image (no AI generation involved).|
|**ADM**|`1`|Generated by Ablated Diffusion Model (Guided Diffusion).|
|**BigGAN**|`2`|Generated by BigGAN.|
|**CycleGAN**|`3`|Generated by CycleGAN.|
|**DALLE2**|`4`|Generated by OpenAI's DALL-E 2.|
|**GauGAN**|`5`|Generated by GauGAN (SPADE).|
|**GLIDE**|`6`|Generated by GLIDE.|
|**Midjourney**|`7`|Generated by Midjourney.|
|**ProGAN**|`8`|Generated by ProGAN (Progressive GAN).|
|**SD14**|`9`|Generated by Stable Diffusion 1.4.|
|**SD15**|`10`|Generated by Stable Diffusion 1.5.|
|**SDXL**|`11`|Generated by Stable Diffusion XL.|
|**StarGAN**|`12`|Generated by StarGAN.|
|**StyleGAN**|`13`|Generated by StyleGAN.|
|**StyleGAN2**|`14`|Generated by StyleGAN2.|
|**VQDM**|`15`|Generated by Vector Quantized Diffusion Model.|
|**WhichFaceIsReal**|`16`|Real human face sourced from the WhichFaceIsReal dataset.|
|**Wukong**|`17`|Generated by the Wukong diffusion model.|

## 🔗 Sources

- **Original dataset**: [Ekko-zn/AIGCDetectBenchmark](https://github.com/Ekko-zn/AIGCDetectBenchmark).
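
Because every instance carries both a `label` and a `generator` field, Task A results can be broken down by generation source. The following is a minimal sketch of one way to do that, not an official evaluation script. The helper names `per_generator_accuracy` and `iter_test_examples` are hypothetical, the toy values in the usage section are made up for illustration, and streaming access through the third-party `datasets` library is an assumption about how a reader might avoid downloading the full split up front.

```python
from collections import defaultdict

# Class names in index order, copied from the card's `generator` feature.
GENERATOR_NAMES = [
    "Real", "ADM", "BigGAN", "CycleGAN", "DALLE2", "GauGAN", "GLIDE",
    "Midjourney", "ProGAN", "SD14", "SD15", "SDXL", "StarGAN", "StyleGAN",
    "StyleGAN2", "VQDM", "WhichFaceIsReal", "Wukong",
]


def per_generator_accuracy(labels, generators, predictions):
    """Return {generator name: accuracy} for binary real/fake predictions.

    labels      -- ground-truth `label` values (0 = real, 1 = fake)
    generators  -- ground-truth `generator` ids (0..17)
    predictions -- predicted binary labels from the detector under test
    """
    if not (len(labels) == len(generators) == len(predictions)):
        raise ValueError("all three sequences must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for label, generator, prediction in zip(labels, generators, predictions):
        name = GENERATOR_NAMES[generator]
        total[name] += 1
        correct[name] += int(label == prediction)
    return {name: correct[name] / total[name] for name in total}


def iter_test_examples():
    """Stream the test split (hypothetical helper; needs the `datasets` package).

    Each yielded example is a dict with the keys `image`, `label` and
    `generator`, matching the fields described in the card.
    """
    from datasets import load_dataset  # imported lazily: third-party dependency

    return load_dataset(
        "TheKernel01/AIGC-Detection-Benchmark", split="test", streaming=True
    )


if __name__ == "__main__":
    # Toy values, invented purely to show the call shape.
    toy_labels = [0, 1, 1, 1]
    toy_generators = [0, 1, 1, 9]
    toy_predictions = [0, 1, 0, 1]
    print(per_generator_accuracy(toy_labels, toy_generators, toy_predictions))
    # prints {'Real': 1.0, 'ADM': 0.5, 'SD14': 1.0}
```

Reporting accuracy per generator rather than as a single number makes it easier to see whether a detector generalizes across GAN and diffusion sources or only performs well on some of them.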

Use cases: