StegaMultiPayload-2026: A Multi-Strategy and Multi-Payload Image Steganalysis Dataset
Source: DataCite Commons. Updated: 2026-05-06. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20048039
Description:
1. Overview
The dataset is designed to support multi-task steganalysis, including detection, algorithm classification, and payload-length estimation. It consists of two primary components: cover images and stego images, organized in a hierarchical and label-consistent structure to enable supervised learning.
1.1 Dataset Directory Structure
The dataset root directory, StegaMultiPayload-2026, contains two primary folders and two metadata files:
StegaMultiPayload-2026/
│
├── cover_data/ (25,000 cover images)
├── stego_data/ (stego images organized hierarchically)
│
├── cover_dataset_index.csv
└── stego_label_metadata.csv
This structure ensures clear separation between original (cover) and modified (stego) images, along with corresponding metadata files for indexing and labeling.
2. Cover Data
The cover dataset (cover_data/) contains 25,000 original, unmodified grayscale images used as the baseline for embedding.
Format: PNG
Colour space: Grayscale (single channel)
Resolution: 512 × 512 and 256 × 256 pixels
Content: Natural images without any embedded payload
Purpose:
· Negative class for steganalysis (no hidden data)
· Reference for distortion comparison
3. Stego Data Structure
The stego dataset (stego_data/) is systematically organized based on embedding strategy and payload size.
3.1 Hierarchical Organization
stego_data/
│
├── LSB/
│ ├── 5_words/ (1000 images)
│ ├── 10_words/ (1000 images)
│ ├── 15_words/ (1000 images)
│ ├── 20_words/ (1000 images)
│ └── 25_words/ (1000 images)
│
├── WOW-inspired/
│ ├── 5_words/ (1000 images)
│ ├── 10_words/ (1000 images)
│ ├── 15_words/ (1000 images)
│ ├── 20_words/ (1000 images)
│ └── 25_words/ (1000 images)
│
├── HILL-inspired/
│ ├── 5_words/ (1000 images)
│ ├── 10_words/ (1000 images)
│ ├── 15_words/ (1000 images)
│ ├── 20_words/ (1000 images)
│ └── 25_words/ (1000 images)
│
├── S-UNIWARD-inspired/
│ ├── 5_words/ (1000 images)
│ ├── 10_words/ (1000 images)
│ ├── 15_words/ (1000 images)
│ ├── 20_words/ (1000 images)
│ └── 25_words/ (1000 images)
│
└── HUGO-inspired/
├── 5_words/ (1000 images)
├── 10_words/ (1000 images)
├── 15_words/ (1000 images)
├── 20_words/ (1000 images)
└── 25_words/ (1000 images)
Each embedding strategy consists of five payload size categories (5, 10, 15, 20, and 25 words), and each payload size folder contains 1,000 stego images, resulting in 5,000 images per embedding strategy.
The hierarchical structure enables both fine-grained (payload-level) and coarse-grained (algorithm-level) analysis.
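Because the algorithm and payload labels are encoded in the folder names, a sample index can be built directly from the directory tree. The following is a minimal sketch, assuming the folder names match the listing above exactly and that all stego images use the .png extension; the function names index_stego_images and count_per_folder are hypothetical and not part of the dataset.

```python
from pathlib import Path

# Folder names as shown in the directory listing (assumed to match on disk).
ALGORITHMS = ["LSB", "WOW-inspired", "HILL-inspired",
              "S-UNIWARD-inspired", "HUGO-inspired"]
PAYLOADS = ["5_words", "10_words", "15_words", "20_words", "25_words"]


def index_stego_images(stego_root):
    """Return a list of (path, algorithm, payload) tuples from the folder layout."""
    root = Path(stego_root)
    samples = []
    for algorithm in ALGORITHMS:
        for payload in PAYLOADS:
            folder = root / algorithm / payload
            if not folder.is_dir():
                continue  # tolerate partial downloads
            for path in sorted(folder.glob("*.png")):
                samples.append((path, algorithm, payload))
    return samples


def count_per_folder(samples):
    """Count images per (algorithm, payload) pair, e.g. to check the balance."""
    counts = {}
    for _, algorithm, payload in samples:
        key = (algorithm, payload)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

On a complete download, count_per_folder(index_stego_images("StegaMultiPayload-2026/stego_data")) would be expected to report 1,000 images for each of the 25 folders.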
4. Dataset Composition
The dataset is designed to be balanced across all embedding strategies and payload categories, ensuring fair evaluation for steganalysis tasks.
Each embedding algorithm contains 5 payload categories, and each payload category contains 1,000 images, resulting in 5,000 stego images per algorithm.
In addition, the dataset includes a separate set of 25,000 cover images.
Category         Subcategory                           Count    Size
---------------  ------------------------------------  -------  -------
Cover Data       Total Cover Images                    25,000   3.41 GB
Stego Data       Embedding Algorithms                  5        2.33 GB
                 Payload Categories per Algorithm      5
                 Images per Category (per Algorithm)   1,000
                 Images per Algorithm                  5,000
                 Total Stego Images                    25,000
Overall Dataset  Total Images (Cover + Stego)          50,000   5.74 GB
The dataset maintains a one-to-one correspondence between cover and stego images, where each stego image is generated from a specific cover image. The dataset is fully balanced across embedding strategies and payload categories, ensuring unbiased evaluation for classification and detection tasks.
5. Image Properties
· Format: PNG
· Type: Grayscale
· Resolution: 512 × 512 and 256 × 256 pixels
· Bit depth: 8-bit
· Embedding: 1 bit per selected pixel
6. Stego Image File Naming Convention
A consistent naming scheme encodes metadata directly in each stego image filename, enabling easy identification and traceability:
<ImageId>_<CoverImageName>_<algorithm>_<payload>.png
Example: P00001_BOWS_03885.png_LSB_W5.png
· ImageId → unique identifier assigned to the stego image (P00001 in the example)
· CoverImageName → original cover image filename, including its extension (BOWS_03885.png in the example)
· algorithm → embedding strategy used (LSB in the example)
· payload → word-length category; the example uses the short form W5, which appears to correspond to the 5_words category
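The naming scheme can be decoded mechanically. The sketch below splits a filename from the right, which assumes that the algorithm and payload tokens contain no underscores (true for the example above, unverified for the other four strategies) and that the payload token has the form W followed by a number. The function name parse_stego_filename is hypothetical.

```python
def parse_stego_filename(filename):
    """Split '<ImageId>_<CoverImageName>_<algorithm>_<payload>.png' into its fields."""
    if not filename.lower().endswith(".png"):
        raise ValueError(f"expected a .png filename: {filename!r}")
    stem = filename[:-len(".png")]

    # The cover name may itself contain underscores, so peel the last two
    # tokens (algorithm, payload) off the right-hand side first.
    parts = stem.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected filename layout: {filename!r}")
    head, algorithm, payload = parts

    # The first token of the remainder is the image ID; the rest is the cover name.
    image_id, sep, cover_name = head.partition("_")
    if not sep or not cover_name:
        raise ValueError(f"missing cover image name: {filename!r}")

    # Assumption: payload tokens look like 'W5'; anything else maps to None.
    digits = payload[1:]
    words = int(digits) if payload[:1] == "W" and digits.isdigit() else None

    return {
        "image_id": image_id,
        "cover_image": cover_name,
        "algorithm": algorithm,
        "payload": payload,
        "payload_words": words,
    }
```

For the example filename, this returns the image ID P00001, the cover image BOWS_03885.png, the algorithm LSB, and the payload token W5. Where the filename and stego_label_metadata.csv disagree, the CSV is likely the safer source of labels.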
7. Annotation and CSV Metadata Files
To facilitate supervised learning, reproducibility, and structured access, the dataset includes two metadata files in CSV format:
7.1 cover_dataset_index.csv
This file provides indexing information for all cover images stored in the cover_data/ directory.
Fields include:
· ImageId: Unique identifier for each image
· Dataset_Name: Source dataset (e.g., BOSSBase, BOWS, ALASKA)
7.2 stego_label_metadata.csv
This file contains detailed annotations for all stego images stored under the stego_data/ directory.
Fields include:
· Id: Unique identifier for each stego image
· cover_image: Corresponding original cover image filename
· stego_image: Generated stego image filename
· algorithm: Embedding strategy used (LSB, WOW-inspired, etc.)
· payload_id: Encoded payload category (e.g., 5_words, 10_words)
· length: Payload length
· height: Image height in pixels
· width: Image width in pixels
· retries: Number of embedding attempts
· status: Embedding success status
Together, these metadata files support multi-task learning, including steganography detection, embedding-algorithm classification, and payload estimation.
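As an illustration of how the stego metadata could feed a multi-task pipeline, the sketch below reads the CSV with the standard library and derives one label per task. It assumes that the column headers match the field names listed above exactly, that the algorithm values are spelled like the folder names, and that payload_id values follow the 5_words pattern; none of this is confirmed by the dataset description. The names load_stego_labels and ALGORITHM_TO_INDEX are hypothetical.

```python
import csv
import io

# Class indices are an arbitrary choice made for this sketch.
ALGORITHM_TO_INDEX = {
    "LSB": 0,
    "WOW-inspired": 1,
    "HILL-inspired": 2,
    "S-UNIWARD-inspired": 3,
    "HUGO-inspired": 4,
}


def load_stego_labels(csv_file):
    """Read stego metadata rows from an open text file and build task labels."""
    labels = []
    for row in csv.DictReader(csv_file):
        payload_id = row["payload_id"].strip()
        labels.append({
            "stego_image": row["stego_image"].strip(),
            "cover_image": row["cover_image"].strip(),
            "is_stego": 1,  # detection label; cover images would get 0
            "algorithm_class": ALGORITHM_TO_INDEX[row["algorithm"].strip()],
            "payload_words": int(payload_id.split("_")[0]),  # '5_words' -> 5
        })
    return labels
```

A typical call would open StegaMultiPayload-2026/stego_label_metadata.csv with newline="" and pass the file object to load_stego_labels. The status and retries columns are left untouched here because their value formats are not documented.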
8. Embedding Strategies (Inspired Variants)
The dataset uses variants inspired by classical algorithms, not exact implementations of them. Results obtained on this dataset should therefore not be interpreted as results on the original WOW, HILL, S-UNIWARD, or HUGO algorithms.
8.1 LSB (Baseline)
Pixel selection: Random
Behavior: Uniform noise distribution
Role: Baseline for comparison
8.2 WOW-inspired (Texture-Based)
Pixel selection: High-gradient regions
Behavior: Embedding in edges and textures
Goal: Reduce visible distortion
8.3 HILL-inspired (Smooth-Based)
Pixel selection: Low-gradient regions
Behavior: Embedding in smooth areas
Goal: Evaluate subtle modifications
8.4 S-UNIWARD-inspired (Uniform Distribution)
Pixel selection: Even spatial spread
Behavior: Balanced embedding
Goal: Avoid spatial bias
8.5 HUGO-inspired (Controlled Random)
Pixel selection: Random with spacing constraint
Behavior: Non-clustered embedding
Goal: Preserve statistical consistency
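The descriptions above state the intent of each pixel-selection rule but not the exact procedure used to build the dataset. The following sketch shows one way such rules could be expressed, purely as an illustration: the central-difference gradient, the strategy names, the seed, and the min_gap parameter are all assumptions, and the images are plain lists of rows of 8-bit values.

```python
import random


def gradient_magnitude(image):
    """Sum of absolute horizontal and vertical central differences (borders clamped)."""
    h, w = len(image), len(image[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = image[y][max(x - 1, 0)], image[y][min(x + 1, w - 1)]
            up, down = image[max(y - 1, 0)][x], image[min(y + 1, h - 1)][x]
            grad[y][x] = abs(right - left) + abs(down - up)
    return grad


def select_pixels(image, n, strategy, seed=0, min_gap=2):
    """Return n (row, col) positions chosen by one of the illustrative strategies."""
    h, w = len(image), len(image[0])
    positions = [(y, x) for y in range(h) for x in range(w)]
    if n > len(positions):
        raise ValueError("more pixels requested than the image contains")
    if n == 0:
        return []
    rng = random.Random(seed)

    if strategy == "random":  # LSB baseline
        return rng.sample(positions, n)
    if strategy in ("high_gradient", "low_gradient"):  # texture vs. smooth regions
        grad = gradient_magnitude(image)
        ranked = sorted(positions, key=lambda p: grad[p[0]][p[1]],
                        reverse=(strategy == "high_gradient"))
        return ranked[:n]
    if strategy == "even_spread":  # uniform stride over the row-major index
        return [positions[(i * len(positions)) // n] for i in range(n)]
    if strategy == "spaced_random":  # random, but no two picks closer than min_gap
        rng.shuffle(positions)
        chosen = []
        for y, x in positions:
            if all(max(abs(y - cy), abs(x - cx)) >= min_gap for cy, cx in chosen):
                chosen.append((y, x))
                if len(chosen) == n:
                    return chosen
        raise ValueError("image too small for n pixels at this spacing")
    raise ValueError(f"unknown strategy: {strategy!r}")
```

The spacing check compares every candidate against all previously chosen pixels, which is acceptable for short payloads but would need a spatial index for large ones.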
9. Embedding Mechanism
Technique: Least Significant Bit (LSB) substitution
Each selected pixel carries 1 bit of payload
Embedding follows a sequential order over selected pixel positions
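LSB substitution over an ordered list of selected positions can be sketched as follows. This is an illustration of the general technique, not the code used to generate the dataset: the UTF-8 text encoding, the most-significant-bit-first ordering, and the function names are assumptions, and images are represented as lists of rows of 8-bit values.

```python
def text_to_bits(text):
    """Encode text as UTF-8 and expand each byte into 8 bits, MSB first."""
    bits = []
    for byte in text.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def embed_lsb(image, positions, bits):
    """Write one payload bit into the LSB of each position, in sequence."""
    if len(bits) > len(positions):
        raise ValueError("payload does not fit in the selected pixels")
    stego = [row[:] for row in image]  # leave the cover image untouched
    for (y, x), bit in zip(positions, bits):
        stego[y][x] = (stego[y][x] & 0xFE) | bit  # clear the LSB, then set it
    return stego


def extract_lsb(stego, positions, n_bits):
    """Read the LSBs back in the same sequential order."""
    return [stego[y][x] & 1 for (y, x) in positions[:n_bits]]


def bits_to_text(bits):
    """Pack bits (MSB first) into bytes and decode them as UTF-8."""
    data = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8", errors="replace")
```

Each modified pixel changes by at most 1 grey level, and extraction requires the same position list in the same order as embedding.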
Provider:
Zenodo
Created:
2026-05-06



