Asketla/AIGC_Image_Steganography_Dataset
Source: Hugging Face · Updated 2026-04-02 · Indexed 2026-04-12
Download link: https://hf-mirror.com/datasets/Asketla/AIGC_Image_Steganography_Dataset
Resource description:
---
license: apache-2.0
task_categories:
- image-classification
- unconditional-image-generation
tags:
- steganography
- aigc
- text-to-image
- deepfake-detection
- image-forensics
size_categories:
- 10K<n<100K
---
# AIGC Image Steganography Dataset
## 📖 Dataset Description
This dataset is specifically designed for research in Artificial Intelligence Generated Content (AIGC) image steganography, steganalysis, and image forensics.
To construct a highly diverse and standardized dataset, we selected **10 prominent domestic and international text-to-image (T2I) large models** and batch-generated the images via their official APIs.
We defined **10 typical image styles**, and each model generated 500 images with diverse content for every style. This yields 5,000 images per model and **50,000 images** in total (10 models × 10 styles × 500 images).
To ensure experimental uniformity and rigorously evaluate the performance of steganography algorithms, the resolution of all generated images was strictly fixed at **1024 × 1024**. This prevents resolution variations from introducing unwanted interference into steganographic experiments.
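Because every image has the same dimensions, every cover image offers the same raw embedding capacity, so payload rates are directly comparable across models. The sketch below works through that arithmetic for a simple least-significant-bit (LSB) scheme. The three-channel RGB layout, the one-bit-per-channel embedding rate, and the 100 kB message size are illustrative assumptions, not properties stated by the dataset.

```python
# Illustrative arithmetic only: RGB channels and 1 bit per channel are assumptions.
WIDTH = 1024
HEIGHT = 1024


def lsb_capacity_bits(width: int, height: int, channels: int = 3,
                      bits_per_channel: int = 1) -> int:
    """Return the raw number of payload bits an LSB scheme could embed."""
    return width * height * channels * bits_per_channel


def payload_rate_bpp(payload_bits: int, width: int, height: int) -> float:
    """Return the payload rate in bits per pixel for a given message size."""
    return payload_bits / (width * height)


capacity_bits = lsb_capacity_bits(WIDTH, HEIGHT)
print(capacity_bits)       # raw capacity in bits
print(capacity_bits // 8)  # raw capacity in bytes

# Rate for a hypothetical 100 kB (100,000-byte) message.
print(payload_rate_bpp(100_000 * 8, WIDTH, HEIGHT))
```

Under these assumptions each image holds 3,145,728 raw bits (393,216 bytes), and the same message always corresponds to the same bits-per-pixel rate regardless of which source model produced the cover image.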
## 🤖 Source Models
The 50,000 images in this dataset were generated by the following 10 advanced T2I models:
1. **Baidu (百度)**
2. **LiblibAI (liblib)**
3. **Seedream 3.0 (火山大模型)**
4. **Seedream 4.0 (火山大模型)**
5. **Star3 Batch**
6. **Tencent Hunyuan (混元)**
7. **Kuaishou Kling (可灵)**
8. **Alibaba Qwen (千问)**
9. **iFLYTEK Spark (讯飞)**
10. **Zhipu AI (智谱)**
## 📁 Dataset Structure
The dataset is organized into 10 main directories based on the source T2I models. Each directory contains the generated images for that specific model.
```text
AIGC_Image_Steganography_Dataset/
├── img_baidu/ # 5,000 images
├── img_liblib/ # 5,000 images
├── img_seedream3.0/ # 5,000 images
├── img_seedream4.0/ # 5,000 images
├── img_star3_batch/ # 5,000 images
├── img_混元/ # 5,000 images
├── img_可灵/ # 5,000 images
├── img_千问/ # 5,000 images
├── img_讯飞/ # 5,000 images
└── img_智谱/ # 5,000 images
```
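After downloading, the layout above can be checked with a short script. This is a minimal sketch using only the standard library: the directory names come from the tree above and the display names from the model list, while the set of accepted file extensions and the recursive search are assumptions, since the card does not state the image format or whether each model directory contains subfolders.

```python
from collections import Counter
from pathlib import Path

# Directory names copied from the tree above; display names from the model list.
MODEL_DIRS = {
    "img_baidu": "Baidu",
    "img_liblib": "LiblibAI",
    "img_seedream3.0": "Seedream 3.0",
    "img_seedream4.0": "Seedream 4.0",
    "img_star3_batch": "Star3 Batch",
    "img_混元": "Tencent Hunyuan",
    "img_可灵": "Kuaishou Kling",
    "img_千问": "Alibaba Qwen",
    "img_讯飞": "iFLYTEK Spark",
    "img_智谱": "Zhipu AI",
}

# Assumption: the file extensions are not documented, so common ones are accepted.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def count_images(root: str) -> Counter:
    """Count image files under each model directory, searching recursively."""
    counts = Counter()
    for dir_name, model in MODEL_DIRS.items():
        model_dir = Path(root) / dir_name
        if not model_dir.is_dir():
            continue  # skip directories that were not downloaded
        counts[model] = sum(
            1
            for path in model_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

For a complete download, `count_images("./aigc_steganography_data")` would be expected to report 5,000 images for each of the 10 models and 50,000 in total. The same `MODEL_DIRS` mapping can double as a label map when the images are used for source-model classification.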
## 🚀 How to Use (Python)
You can easily download and load this dataset using the Hugging Face `datasets` library or the `huggingface_hub` tool.
### Method 1: Using the `datasets` library (Recommended for ML pipelines)
First, install the library in your terminal:
```bash
pip install datasets
```
Then, load the dataset in your Python script:
```python
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("Asketla/AIGC_Image_Steganography_Dataset")
# Print the dataset information
print(dataset)
```
### Method 2: Downloading raw files via `huggingface_hub`
If you prefer to download the raw image folders directly to your local machine:
First, install the library in your terminal:
```bash
pip install huggingface_hub
```
Then, run the following Python script:
```python
from huggingface_hub import snapshot_download
# Download the entire dataset repository to a local directory
local_dir = snapshot_download(
    repo_id="Asketla/AIGC_Image_Steganography_Dataset",
    repo_type="dataset",
    local_dir="./aigc_steganography_data",  # Specify your desired local path
    max_workers=4,  # Adjust based on your network speed
)
print(f"Dataset successfully downloaded to: {local_dir}")
```
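If only some source models are needed, the download can be restricted to their directories. The sketch below uses the `allow_patterns` parameter of `snapshot_download`. The helper names `patterns_for` and `download_models` are hypothetical, and the patterns assume the top-level directory names shown in the dataset structure.

```python
from typing import Iterable, List


def patterns_for(model_dirs: Iterable[str]) -> List[str]:
    """Build glob patterns matching the files under the given top-level directories."""
    return [f"{name}/*" for name in model_dirs]


def download_models(model_dirs: Iterable[str],
                    local_dir: str = "./aigc_steganography_data") -> str:
    """Download only the selected model directories and return the local path."""
    # Imported here so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="Asketla/AIGC_Image_Steganography_Dataset",
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=patterns_for(model_dirs),
    )
```

For example, `download_models(["img_baidu", "img_liblib"])` would fetch roughly 10,000 of the 50,000 images, which can be useful for a quick experiment before committing to the full download.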
Provider: Asketla



