
tsilva/mnist-gaussian-noisy

Source: Hugging Face. Updated 2026-03-11; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/tsilva/mnist-gaussian-noisy
Description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: noise
    dtype:
      array2_d:
        shape:
        - 28
        - 28
        dtype: float32
  - name: raw_image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': '0'
          '1': '1'
          '2': '2'
          '3': '3'
          '4': '4'
          '5': '5'
          '6': '6'
          '7': '7'
          '8': '8'
          '9': '9'
  - name: source_index
    dtype: int32
  - name: replica_index
    dtype: int16
  - name: noise_variance
    dtype: float32
  splits:
  - name: train
    num_bytes: 1268081241
    num_examples: 300000
  - name: test
    num_bytes: 188565919
    num_examples: 44600
  download_size: 1460541650
  dataset_size: 1456647160
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---

# tsilva/mnist-gaussian-noisy

## Dataset Summary

This dataset expands MNIST by creating multiple Gaussian-noisy variants of each original example. Each row is structured for direct supervised training: the input is a noisy image and the target is the original sampled Gaussian noise map, with the clean image kept as a reference column.

Noise is sampled from a zero-mean normal distribution on normalized pixel values in `[0, 1]`, added to the clean image, clipped back to `[0, 1]`, and converted to 8-bit grayscale. The `noise` column stores the original sampled Gaussian draw before clipping.
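The corruption steps described in the summary can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the script used to build the dataset: the function name `add_gaussian_noise`, the use of `np.random.default_rng`, the rounding step before the 8-bit cast, and the all-zero stand-in image are all hypothetical choices.

```python
import numpy as np

def add_gaussian_noise(clean_u8, variance, rng):
    """Corrupt one uint8 grayscale image; return (noisy_u8, noise)."""
    clean = clean_u8.astype(np.float32) / 255.0   # normalize pixels to [0, 1]
    sigma = np.sqrt(variance)                     # variance -> standard deviation
    noise = rng.normal(0.0, sigma, size=clean.shape).astype(np.float32)
    noisy = np.clip(clean + noise, 0.0, 1.0)      # clip the sum back to [0, 1]
    # Rounding before the cast is an assumption; the card only says "converted to 8-bit".
    noisy_u8 = np.round(noisy * 255.0).astype(np.uint8)
    return noisy_u8, noise                        # noise is the unclipped draw

rng = np.random.default_rng(42)                   # generator choice is an assumption
clean = np.zeros((28, 28), dtype=np.uint8)        # stand-in for a clean MNIST digit
noisy, noise = add_gaussian_noise(clean, 0.0550, rng)
print(noisy.dtype, noisy.shape, noise.dtype)
```

Because the returned `noise` is the draw before clipping, it can contain values that push a pixel below 0 or above 1 even though the stored image never leaves that range.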
## Columns

- `image`: the noisy 28x28 grayscale input image used as the model source
- `noise`: the 28x28 float Gaussian noise sample in normalized pixel space
- `raw_image`: the clean 28x28 grayscale reference image
- `label`: the original digit class from `0` to `9`
- `source_index`: the original example index inside the source MNIST split
- `replica_index`: which noisy replica this row corresponds to for the clean source image
- `noise_variance`: the Gaussian variance used to sample the stored noise map

## Splits

- `train`: 300,000 image pairs
- `test`: 44,600 image pairs, balanced to `4,460` pairs per class

## Noise Configuration

- Source dataset: MNIST
- Noisy counterparts per source example: `5`
- Variances: `0.0100, 0.0325, 0.0550, 0.0775, 0.1000`
- Random seed: `42`
- Test balancing: exact class balance via downsampling the MNIST test split to the minimum class count

## Intended Use

This dataset is intended for experiments where each training row should already contain a noisy source image and the original noise sample used to corrupt it. It is suited for noise prediction and generative or iterative denoising setups that operate directly on sampled noise fields.

## Load Example

```python
from datasets import load_dataset

# Downloads both splits on first use
ds = load_dataset("tsilva/mnist-gaussian-noisy")

sample = ds["train"][0]
print(sample["image"])           # noisy input image
print(sample["noise"][0][0])     # top-left value of the 28x28 noise map
print(sample["raw_image"])       # clean reference image
print(sample["noise_variance"])  # variance used for this row's noise
```
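For noise prediction, a row has to be turned into numeric arrays. The sketch below shows one way to do that and also illustrates why the stored `noise` is not simply `image - raw_image`: clipping and 8-bit quantization alter some pixels. The helper `row_to_arrays` and the synthetic row are hypothetical, used so the example runs offline; with the real dataset one would pass a row such as `ds["train"][0]`, assuming the image columns decode to 8-bit grayscale images.

```python
import numpy as np

def row_to_arrays(row):
    """Return (noisy, noise, clean) as float32 arrays; images scaled to [0, 1]."""
    noisy = np.asarray(row["image"], dtype=np.float32) / 255.0
    clean = np.asarray(row["raw_image"], dtype=np.float32) / 255.0
    noise = np.asarray(row["noise"], dtype=np.float32)
    return noisy, noise, clean

# Synthetic stand-in row (hypothetical values) mimicking the dataset's columns.
rng = np.random.default_rng(0)
clean_u8 = np.zeros((28, 28), dtype=np.uint8)
draw = rng.normal(0.0, np.sqrt(0.1), size=(28, 28)).astype(np.float32)
noisy_u8 = np.round(np.clip(clean_u8 / 255.0 + draw, 0.0, 1.0) * 255.0).astype(np.uint8)
row = {"image": noisy_u8, "raw_image": clean_u8,
       "noise": draw.tolist(), "noise_variance": 0.1}

noisy, noise, clean = row_to_arrays(row)
residual = noisy - clean                          # change actually visible in the pixels
# Pixels where the residual differs from the draw by more than one gray level were clipped.
clipped = np.abs(residual - noise) > 1.0 / 255.0
print(noisy.shape, noise.shape, float(clipped.mean()))
```

A model trained on `(noisy, noise)` pairs is therefore asked to predict the unclipped draw, which can differ from the visible pixel residual wherever the sum left `[0, 1]`.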
Provider:
tsilva