StegaMultiPayload-2026: A Multi-Strategy and Multi-Payload Image Steganalysis Dataset

Source: DataCite Commons. Updated 2026-05-06; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20048039
Description:
1. Overview

The dataset is designed to support multi-task steganalysis: detection, algorithm classification, and payload-length estimation. It consists of two primary components, cover images and stego images, organized in a hierarchical, label-consistent structure for supervised learning.

1.1 Dataset Directory Structure

The dataset root directory, StegaMultiPayload-2026, contains two primary folders and two metadata files:

StegaMultiPayload-2026/
│
├── cover_data/                (25,000 cover images)
├── stego_data/                (stego images organized hierarchically)
│
├── cover_dataset_index.csv
└── stego_label_metadata.csv

This structure separates original (cover) images from modified (stego) images and keeps the metadata files for indexing and labeling alongside them.

2. Cover Data

The cover dataset (cover_data/) contains 25,000 original, unmodified grayscale images used as the baseline for embedding.

- Format: PNG
- Colour space: Grayscale (single channel)
- Resolution: 512 x 512 or 256 x 256
- Content: Natural images without any embedded payload
- Purpose:
    - Negative class for steganalysis (no hidden data)
    - Reference for distortion comparison

3. Stego Data Structure

The stego dataset (stego_data/) is organized by embedding strategy and payload size.
3.1 Hierarchical Organization

stego_data/
│
├── LSB/
│   ├── 5_words/      (1000 images)
│   ├── 10_words/     (1000 images)
│   ├── 15_words/     (1000 images)
│   ├── 20_words/     (1000 images)
│   └── 25_words/     (1000 images)
│
├── WOW-inspired/
│   ├── 5_words/      (1000 images)
│   ├── 10_words/     (1000 images)
│   ├── 15_words/     (1000 images)
│   ├── 20_words/     (1000 images)
│   └── 25_words/     (1000 images)
│
├── HILL-inspired/
│   ├── 5_words/      (1000 images)
│   ├── 10_words/     (1000 images)
│   ├── 15_words/     (1000 images)
│   ├── 20_words/     (1000 images)
│   └── 25_words/     (1000 images)
│
├── S-UNIWARD-inspired/
│   ├── 5_words/      (1000 images)
│   ├── 10_words/     (1000 images)
│   ├── 15_words/     (1000 images)
│   ├── 20_words/     (1000 images)
│   └── 25_words/     (1000 images)
│
└── HUGO-inspired/
    ├── 5_words/      (1000 images)
    ├── 10_words/     (1000 images)
    ├── 15_words/     (1000 images)
    ├── 20_words/     (1000 images)
    └── 25_words/     (1000 images)

Each embedding strategy has five payload size categories (5, 10, 15, 20, and 25 words), and each payload size folder contains 1,000 stego images, giving 5,000 images per embedding strategy. The hierarchy supports both fine-grained (payload-level) and coarse-grained (algorithm-level) analysis.

4. Dataset Composition

The dataset is balanced across all embedding strategies and payload categories, so that no strategy or payload size is over-represented when evaluating steganalysis methods. With 5 embedding algorithms at 5,000 stego images each, the stego set totals 25,000 images. In addition, the dataset includes a separate set of 25,000 cover images.
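The balance described above can be checked on a local copy by counting the PNG files in each strategy/payload folder. The following is a minimal sketch, not part of the dataset distribution: the function names `count_stego_images` and `is_balanced` are hypothetical, and the folder names are assumed to match the directory tree exactly.

```python
from pathlib import Path

# Folder names as listed in the stego_data/ directory tree.
ALGORITHMS = ["LSB", "WOW-inspired", "HILL-inspired",
              "S-UNIWARD-inspired", "HUGO-inspired"]
PAYLOADS = ["5_words", "10_words", "15_words", "20_words", "25_words"]


def count_stego_images(stego_root):
    """Return {(algorithm, payload): number of .png files} for every expected folder.

    Hypothetical helper: a missing folder is reported as 0 rather than raising.
    """
    root = Path(stego_root)
    counts = {}
    for algo in ALGORITHMS:
        for payload in PAYLOADS:
            folder = root / algo / payload
            n = len(list(folder.glob("*.png"))) if folder.is_dir() else 0
            counts[(algo, payload)] = n
    return counts


def is_balanced(counts, expected=1000):
    """True when every strategy/payload folder holds exactly `expected` images."""
    return all(n == expected for n in counts.values())
```

For example, `is_balanced(count_stego_images("StegaMultiPayload-2026/stego_data"))` should hold if the local copy matches the documented layout of 1,000 images per folder.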
Category          Subcategory                            Count     Size
Cover Data        Total Cover Images                     25,000    3.41 GB
Stego Data        Embedding Algorithms                   5
                  Payload Categories per Algorithm       5
                  Images per Category (per Algorithm)    1,000
                  Images per Algorithm                   5,000
                  Total Stego Images                     25,000    2.33 GB
Overall Dataset   Total Images (Cover + Stego)           50,000    5.74 GB

The dataset maintains a one-to-one correspondence between cover and stego images: each stego image is generated from a specific cover image. Because every embedding strategy and payload category holds the same number of images, classification and detection results are not skewed by class imbalance.

5. Image Properties

- Format: PNG
- Type: Grayscale
- Resolution: 512 x 512 or 256 x 256
- Bit depth: 8-bit
- Embedding: 1 bit per selected pixel

6. Stego Image File Naming Convention

A consistent naming scheme encodes metadata directly in stego image filenames, so each file can be identified and traced back to its cover image:

<ImageId>_<Cover_image_name>_<algorithm>_<payload>.png

Example: P00001_BOWS_03885.png_LSB_W5.png

- ImageId: unique identifier assigned to the image
- Cover_image_name: original cover image filename, including its extension (BOWS_03885.png in the example)
- algorithm: embedding strategy used
- payload: word-length category (5_words, 10_words, etc.), written as W5 in the example

7. Annotation and Excel Representation

To facilitate supervised learning, reproducibility, and structured access, the dataset includes two metadata files in CSV format.

7.1 cover_dataset_index.csv

This file provides indexing information for all cover images stored in the cover_data/ directory. Fields include:

- ImageId: unique identifier for each image
- Dataset_Name: source dataset (e.g., BOSSBase, BOWS, ALASKA)

7.2 stego_label_metadata.csv

This file contains detailed annotations for all stego images stored under the stego_data/ directory.
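The naming convention from section 6 can be split back into its parts with a few string operations. This is a sketch under stated assumptions rather than an official parser: the name `parse_stego_filename` is hypothetical, and it assumes that the image ID, the algorithm token, and the payload token contain no underscores (true for the documented example, but not confirmed for every file), while the cover image name may contain them.

```python
from typing import NamedTuple


class StegoName(NamedTuple):
    image_id: str
    cover_image_name: str
    algorithm: str
    payload: str


def parse_stego_filename(filename: str) -> StegoName:
    """Split '<ImageId>_<Cover_image_name>_<algorithm>_<payload>.png' into its fields.

    Assumption: only the cover image name may contain underscores, so the
    image ID is taken from the left and algorithm/payload from the right.
    """
    if not filename.endswith(".png"):
        raise ValueError(f"not a .png filename: {filename!r}")
    stem = filename[: -len(".png")]
    parts = stem.split("_", 1)            # image ID is the first token
    if len(parts) != 2:
        raise ValueError(f"unexpected filename layout: {filename!r}")
    image_id, rest = parts
    tail = rest.rsplit("_", 2)            # algorithm and payload are the last two tokens
    if len(tail) != 3:
        raise ValueError(f"unexpected filename layout: {filename!r}")
    cover, algorithm, payload = tail
    return StegoName(image_id, cover, algorithm, payload)
```

Applied to the documented example, `parse_stego_filename("P00001_BOWS_03885.png_LSB_W5.png")` yields the image ID `P00001`, the cover image `BOWS_03885.png`, the algorithm `LSB`, and the payload token `W5`.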
Fields in stego_label_metadata.csv include:

- Id: unique identifier for each stego image
- cover_image: corresponding original cover image filename
- stego_image: generated stego image filename
- algorithm: embedding strategy used (LSB, WOW-inspired, etc.)
- payload_id: encoded payload category (e.g., 5_words, 10_words)
- length: payload length
- height: image height
- width: image width
- retries: number of embedding attempts
- status: embedding success status

Together, the metadata files support multi-task learning: steganography detection, algorithm classification, and payload estimation.

8. Embedding Strategies (Inspired Variants)

The dataset uses inspired variants, not exact implementations, of the classical algorithms. The variants differ in how pixels are selected; all of them embed with the LSB mechanism described in section 9. This distinction matters when relating results on this dataset to work on the original algorithms.

8.1 LSB (Baseline)

- Pixel selection: Random
- Behavior: Uniform noise distribution
- Role: Baseline for comparison

8.2 WOW-inspired (Texture-Based)

- Pixel selection: High-gradient regions
- Behavior: Embedding in edges and textures
- Goal: Reduce visible distortion

8.3 HILL-inspired (Smooth-Based)

- Pixel selection: Low-gradient regions
- Behavior: Embedding in smooth areas
- Goal: Evaluate subtle modifications

8.4 S-UNIWARD-inspired (Uniform Distribution)

- Pixel selection: Even spatial spread
- Behavior: Balanced embedding
- Goal: Avoid spatial bias

8.5 HUGO-inspired (Controlled Random)

- Pixel selection: Random with spacing constraint
- Behavior: Non-clustered embedding
- Goal: Preserve statistical consistency

9. Embedding Mechanism

- Technique: Least Significant Bit (LSB) substitution
- Each selected pixel carries 1 bit of payload
- Embedding follows a sequential order over the selected pixel positions
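To make the mechanism concrete, the sketch below embeds a text payload by LSB substitution, one bit per selected pixel, writing bits in the order of the selected positions. It is an illustration only and not the code used to generate the dataset: the UTF-8 bit encoding, the seeded random selector, the gradient-magnitude ranking used to mimic the high-gradient (WOW-inspired) and low-gradient (HILL-inspired) selections, and all function names are assumptions.

```python
import numpy as np


def text_to_bits(text):
    """UTF-8 encode the text and unpack it into a uint8 array of 0/1 values."""
    return np.unpackbits(np.frombuffer(text.encode("utf-8"), dtype=np.uint8))


def bits_to_text(bits):
    """Inverse of text_to_bits; expects a multiple of 8 bits."""
    return np.packbits(bits).tobytes().decode("utf-8")


def random_positions(shape, n_bits, seed=0):
    """Assumed LSB-baseline selector: n_bits distinct flat pixel indices, seeded."""
    n_pixels = shape[0] * shape[1]
    if n_bits > n_pixels:
        raise ValueError("payload does not fit in the image")
    return np.random.default_rng(seed).choice(n_pixels, size=n_bits, replace=False)


def gradient_positions(cover, n_bits, high=True):
    """Assumed texture/smooth selector: rank pixels by gradient magnitude.

    high=True picks the strongest gradients, high=False the weakest.
    """
    gy, gx = np.gradient(cover.astype(np.float64))
    magnitude = np.hypot(gx, gy).ravel()
    order = np.argsort(-magnitude if high else magnitude, kind="stable")
    return order[:n_bits]


def embed_lsb(cover, bits, positions):
    """Replace the least significant bit of each selected pixel with one payload bit."""
    if len(bits) != len(positions):
        raise ValueError("need exactly one position per payload bit")
    stego = cover.flatten()  # flatten() returns a copy, so the cover is untouched
    stego[positions] = (stego[positions] & 0xFE) | bits
    return stego.reshape(cover.shape)


def extract_lsb(stego, positions):
    """Read the payload bits back in the same position order used for embedding."""
    return stego.ravel()[positions] & 1
```

In this sketch the receiver needs the same position list that was used for embedding; recomputing gradient-based positions from the stego image is not guaranteed to give the same order, because embedding changes pixel values slightly. Each pixel changes by at most 1 grey level, consistent with 1 bit per selected pixel.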
Provider:
Zenodo
Created:
2026-05-06