Hermanni/sen12mscr
Source: Hugging Face. Updated 2026-03-31; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Hermanni/sen12mscr
Resource description:
---
license: cc-by-4.0
task_categories:
- image-to-image
tags:
- remote-sensing
- cloud-removal
- SAR
- sentinel
pretty_name: SEN12MS-CR
citation: |
  @article{meraner2020cloud,
    title={Cloud removal in Sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion},
    author={Meraner, Andrea and Ebel, Patrick and Zhu, Xiao Xiang and Schmitt, Michael},
    journal={ISPRS Journal of Photogrammetry and Remote Sensing},
    volume={166},
    pages={333--346},
    year={2020}
  }
size_categories:
- 100K<n<1M
configs:
- config_name: default
  data_files:
  - split: train
    path:
    - spring/scene_1.parquet
    - spring/scene_6.parquet
    - spring/scene_8.parquet
    - spring/scene_9.parquet
    - spring/scene_15.parquet
    - spring/scene_21.parquet
    - spring/scene_26.parquet
    - spring/scene_39.parquet
    - spring/scene_40.parquet
    - spring/scene_45.parquet
    - spring/scene_58.parquet
    - spring/scene_63.parquet
    - spring/scene_66.parquet
    - spring/scene_75.parquet
    - spring/scene_77.parquet
    - spring/scene_97.parquet
    - spring/scene_100.parquet
    - spring/scene_101.parquet
    - spring/scene_109.parquet
    - spring/scene_110.parquet
    - spring/scene_113.parquet
    - spring/scene_115.parquet
    - spring/scene_117.parquet
    - spring/scene_119.parquet
    - spring/scene_120.parquet
    - spring/scene_121.parquet
    - spring/scene_124.parquet
    - spring/scene_126.parquet
    - spring/scene_128.parquet
    - spring/scene_132.parquet
    - spring/scene_134.parquet
    - spring/scene_141.parquet
    - spring/scene_142.parquet
    - spring/scene_145.parquet
    - spring/scene_147.parquet
    - summer/scene_4.parquet
    - summer/scene_7.parquet
    - summer/scene_11.parquet
    - summer/scene_15.parquet
    - summer/scene_25.parquet
    - summer/scene_27.parquet
    - summer/scene_31.parquet
    - summer/scene_36.parquet
    - summer/scene_40.parquet
    - summer/scene_42.parquet
    - summer/scene_43.parquet
    - summer/scene_47.parquet
    - summer/scene_55.parquet
    - summer/scene_56.parquet
    - summer/scene_72.parquet
    - summer/scene_76.parquet
    - summer/scene_86.parquet
    - summer/scene_87.parquet
    - summer/scene_89.parquet
    - summer/scene_90.parquet
    - summer/scene_93.parquet
    - summer/scene_95.parquet
    - summer/scene_100.parquet
    - summer/scene_101.parquet
    - summer/scene_102.parquet
    - summer/scene_113.parquet
    - summer/scene_114.parquet
    - summer/scene_115.parquet
    - summer/scene_120.parquet
    - summer/scene_121.parquet
    - summer/scene_123.parquet
    - summer/scene_124.parquet
    - summer/scene_125.parquet
    - summer/scene_126.parquet
    - summer/scene_132.parquet
    - summer/scene_133.parquet
    - summer/scene_135.parquet
    - summer/scene_137.parquet
    - summer/scene_139.parquet
    - summer/scene_140.parquet
    - summer/scene_143.parquet
    - summer/scene_146.parquet
    - summer/scene_147.parquet
    - fall/scene_1.parquet
    - fall/scene_3.parquet
    - fall/scene_4.parquet
    - fall/scene_6.parquet
    - fall/scene_11.parquet
    - fall/scene_14.parquet
    - fall/scene_19.parquet
    - fall/scene_22.parquet
    - fall/scene_26.parquet
    - fall/scene_27.parquet
    - fall/scene_28.parquet
    - fall/scene_30.parquet
    - fall/scene_31.parquet
    - fall/scene_33.parquet
    - fall/scene_35.parquet
    - fall/scene_37.parquet
    - fall/scene_39.parquet
    - fall/scene_40.parquet
    - fall/scene_41.parquet
    - fall/scene_42.parquet
    - fall/scene_57.parquet
    - fall/scene_64.parquet
    - fall/scene_71.parquet
    - fall/scene_77.parquet
    - fall/scene_81.parquet
    - fall/scene_82.parquet
    - fall/scene_83.parquet
    - fall/scene_85.parquet
    - fall/scene_88.parquet
    - fall/scene_91.parquet
    - fall/scene_93.parquet
    - fall/scene_100.parquet
    - fall/scene_104.parquet
    - fall/scene_105.parquet
    - fall/scene_107.parquet
    - fall/scene_109.parquet
    - fall/scene_110.parquet
    - fall/scene_112.parquet
    - fall/scene_114.parquet
    - fall/scene_116.parquet
    - fall/scene_119.parquet
    - fall/scene_120.parquet
    - fall/scene_122.parquet
    - fall/scene_125.parquet
    - fall/scene_128.parquet
    - fall/scene_131.parquet
    - fall/scene_133.parquet
    - fall/scene_134.parquet
    - fall/scene_135.parquet
    - fall/scene_136.parquet
    - fall/scene_141.parquet
    - fall/scene_142.parquet
    - fall/scene_144.parquet
    - fall/scene_147.parquet
    - fall/scene_148.parquet
    - fall/scene_149.parquet
    - winter/scene_8.parquet
    - winter/scene_21.parquet
    - winter/scene_25.parquet
    - winter/scene_42.parquet
    - winter/scene_47.parquet
    - winter/scene_49.parquet
    - winter/scene_55.parquet
    - winter/scene_59.parquet
    - winter/scene_61.parquet
    - winter/scene_62.parquet
    - winter/scene_64.parquet
    - winter/scene_68.parquet
    - winter/scene_75.parquet
    - winter/scene_81.parquet
    - winter/scene_94.parquet
    - winter/scene_102.parquet
    - winter/scene_104.parquet
    - winter/scene_112.parquet
    - winter/scene_116.parquet
    - winter/scene_135.parquet
    - winter/scene_146.parquet
  - split: validation
    path:
    - spring/scene_17.parquet
    - summer/scene_17.parquet
    - summer/scene_19.parquet
    - summer/scene_80.parquet
    - summer/scene_127.parquet
    - fall/scene_65.parquet
    - winter/scene_22.parquet
    - winter/scene_84.parquet
    - winter/scene_107.parquet
    - winter/scene_130.parquet
  - split: test
    path:
    - spring/scene_31.parquet
    - spring/scene_44.parquet
    - spring/scene_106.parquet
    - spring/scene_123.parquet
    - spring/scene_140.parquet
    - summer/scene_73.parquet
    - summer/scene_119.parquet
    - fall/scene_139.parquet
    - winter/scene_63.parquet
    - winter/scene_108.parquet
---
# SEN12MS-CR
Reorganized mirror of the [SEN12MS-CR dataset](https://mediatum.ub.tum.de/1554803) in Parquet format.
## Quick Start
```python
from datasets import load_dataset
import numpy as np

# Stream samples on demand instead of downloading every Parquet file first
ds = load_dataset("Hermanni/sen12mscr", streaming=True)

for sample in ds["train"]:
    # Each array is stored as raw bytes plus a shape column
    sar = np.frombuffer(sample["sar"], dtype=np.float32).reshape(sample["sar_shape"])
    cloudy = np.frombuffer(sample["cloudy"], dtype=np.int16).reshape(sample["opt_shape"])
    target = np.frombuffer(sample["target"], dtype=np.int16).reshape(sample["opt_shape"])
    # Optical tensors are stored as HWC: (256, 256, 13)
    # Convert to CHW if needed:
    # cloudy = np.transpose(cloudy, (2, 0, 1))
    # target = np.transpose(target, (2, 0, 1))
    break
```
## Notes
- sar is stored as float32
- cloudy and target are stored as int16
- opt_shape is stored in HWC order, typically (256, 256, 13)
- The dtype column is a legacy field and should not be used for decoding cloudy or target
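The decoding rules above can be tried offline on a synthetic sample. This is a minimal sketch: `make_fake_sample` and `decode_optical` are hypothetical helpers, not part of the dataset or of any library, and the sketch assumes the bytes were written in the machine's native byte order.

```python
import numpy as np

def make_fake_sample(h=4, w=4):
    """Build a tiny synthetic row that mimics the documented layout (hypothetical)."""
    sar = np.arange(2 * h * w, dtype=np.float32).reshape(2, h, w)
    opt = np.arange(h * w * 13, dtype=np.int16).reshape(h, w, 13)
    return {
        "sar": sar.tobytes(),
        "cloudy": opt.tobytes(),
        "target": opt.tobytes(),
        "sar_shape": [2, h, w],
        "opt_shape": [h, w, 13],
    }

def decode_optical(buf, shape, chw=True):
    """Decode optical bytes as int16 (ignoring the legacy dtype column)."""
    arr = np.frombuffer(buf, dtype=np.int16).reshape(shape)  # HWC as stored
    return np.transpose(arr, (2, 0, 1)) if chw else arr      # optional CHW view

sample = make_fake_sample()
cloudy_chw = decode_optical(sample["cloudy"], sample["opt_shape"])
sar = np.frombuffer(sample["sar"], dtype=np.float32).reshape(sample["sar_shape"])
print(cloudy_chw.shape, sar.shape)
```

Arrays returned by `np.frombuffer` are read-only views of the byte string, so call `.copy()` before modifying them in place.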
## Full Download
```python
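from datasets import load_dataset

# Without streaming=True, the whole train split is downloaded and cached locally
ds = load_dataset("Hermanni/sen12mscr", split="train")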
```
## PyTorch Example
```python
from torch.utils.data import Dataset, DataLoader
from datasets import load_dataset
import numpy as np
import torch


class SEN12MSCR(Dataset):
    def __init__(self, hf_dataset, normalize=True, chw_optical=True):
        self.ds = hf_dataset
        self.normalize = normalize
        self.chw_optical = chw_optical

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, idx):
        s = self.ds[idx]
        # Decode raw bytes; optical data is int16 on disk and cast to float32 here
        sar = np.frombuffer(s["sar"], dtype=np.float32).reshape(s["sar_shape"]).astype(np.float32)
        cloudy = np.frombuffer(s["cloudy"], dtype=np.int16).reshape(s["opt_shape"]).astype(np.float32)
        target = np.frombuffer(s["target"], dtype=np.int16).reshape(s["opt_shape"]).astype(np.float32)
        if self.chw_optical:
            # HWC -> CHW for PyTorch
            cloudy = np.transpose(cloudy, (2, 0, 1))
            target = np.transpose(target, (2, 0, 1))
        # .copy() gives contiguous, writable memory for torch.from_numpy
        sar = torch.from_numpy(sar.copy())
        cloudy = torch.from_numpy(cloudy.copy())
        target = torch.from_numpy(target.copy())
        if self.normalize:
            # Scale optical values by 1/10000
            cloudy /= 10000.0
            target /= 10000.0
        return {"sar": sar, "cloudy": cloudy, "target": target}


ds = load_dataset("Hermanni/sen12mscr", split="train")
loader = DataLoader(SEN12MSCR(ds), batch_size=8, shuffle=True, num_workers=4)
```
## Contents
- ~122,218 triplets
- SAR: Sentinel-1, 2 channels, float32
- Cloudy: Sentinel-2, 13 channels, int16
- Target: Sentinel-2, 13 channels, int16
- 4 seasons, 175 global ROIs (2018)
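For a quick visual check of a 13-channel optical patch, an RGB composite can be pulled out of the decoded HWC array. The sketch below rests on assumptions not stated in this card: that the channels follow the usual Sentinel-2 band order (B1, B2, B3, B4, ...), so red, green, and blue sit at indices 3, 2, and 1, and that dividing by 10000 gives approximate reflectance. `to_rgb` and its `gain` parameter are hypothetical and chosen only for display.

```python
import numpy as np

# Assumption: channels follow Sentinel-2 band order, so B4/B3/B2 = indices 3/2/1
RGB_INDICES = (3, 2, 1)

def to_rgb(optical_hwc, scale=10000.0, gain=3.5):
    """Return a float32 (H, W, 3) image in [0, 1] from an (H, W, 13) int16 patch."""
    rgb = optical_hwc[..., list(RGB_INDICES)].astype(np.float32) / scale
    # gain is an arbitrary brightening factor for display, not a calibration
    return np.clip(rgb * gain, 0.0, 1.0)

# Synthetic patch standing in for a decoded `cloudy` or `target` array
patch = np.full((2, 2, 13), 1000, dtype=np.int16)
patch[..., 3] = 2000  # make the assumed red band brighter
rgb = to_rgb(patch)
print(rgb.shape, rgb.dtype)
```

If the band order in this mirror differs, only `RGB_INDICES` needs to change.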
## Columns
| Column | Type | Description |
|---|---|---|
| sar | binary | SAR bytes, decode as float32, reshape with sar_shape |
| cloudy | binary | Cloudy S2 bytes, decode as int16, reshape with opt_shape |
| target | binary | Cloud-free S2 bytes, decode as int16, reshape with opt_shape |
| sar_shape | list[int] | SAR shape, typically [2, 256, 256] |
| opt_shape | list[int] | Optical shape, typically [256, 256, 13] |
| dtype | string | Legacy field from SAR export; do not use for optical decoding |
| season | string | spring / summer / fall / winter |
| scene | string | Scene number |
| patch | string | Patch ID |
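Since each array column is raw bytes paired with a shape column, a cheap sanity check is to compare the byte length with what the shape and dtype imply. The following is a sketch of such a check; `validate_row` and `EXPECTED` are hypothetical names, and the dtypes are the ones given in the table above rather than values read from the `dtype` column.

```python
import numpy as np

# column -> (shape column, dtype), taken from the table above
EXPECTED = {
    "sar": ("sar_shape", np.float32),
    "cloudy": ("opt_shape", np.int16),
    "target": ("opt_shape", np.int16),
}

def validate_row(row):
    """Return a list of human-readable problems; an empty list means the row looks consistent."""
    problems = []
    for col, (shape_col, dtype) in EXPECTED.items():
        expected = int(np.prod(row[shape_col])) * np.dtype(dtype).itemsize
        actual = len(row[col])
        if actual != expected:
            problems.append(f"{col}: expected {expected} bytes, got {actual}")
    return problems

# Synthetic row with the typical shapes; real rows come from load_dataset
good_row = {
    "sar": np.zeros((2, 256, 256), dtype=np.float32).tobytes(),
    "cloudy": np.zeros((256, 256, 13), dtype=np.int16).tobytes(),
    "target": np.zeros((256, 256, 13), dtype=np.int16).tobytes(),
    "sar_shape": [2, 256, 256],
    "opt_shape": [256, 256, 13],
}
bad_row = dict(good_row, cloudy=good_row["cloudy"][:-2])  # drop one int16 value
print(validate_row(good_row))
print(validate_row(bad_row))
```

A mismatch usually means the wrong dtype was assumed (for example, trusting the legacy `dtype` column for optical data) or the row is truncated.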
## License
CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
## Source
- mediaTUM (ID: 1554803) (https://mediatum.ub.tum.de/1554803)
Provider:
Hermanni



