IsuruDiIshan/amazon-sentinel2-forest-segmentation
Source: Hugging Face. Updated 2026-04-21; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/IsuruDiIshan/amazon-sentinel2-forest-segmentation
Resource description:
---
annotations_creators:
- no-annotation
language: en
license: cc-by-4.0
multilingual: false
size_categories: n<1000
source_datasets:
- original
---
# Amazon Sentinel-2 Forest Segmentation Dataset
## Dataset Description
This dataset contains satellite images from the **Amazon biome** for semantic segmentation of forested areas. The images were extracted from **Sentinel-2 Level 2A** satellite imagery and converted to GeoTIFF format to preserve all four spectral bands.
### Source
- **Original Source**: [Zenodo](https://doi.org/10.5281/zenodo.4498086)
- **Paper**: Bragagnolo, L., da Silva, R.V., & Grzybowski, J.M.V. (2021). Amazon and Atlantic Forest image datasets for semantic segmentation. *Zenodo*. https://doi.org/10.5281/zenodo.4498086
## Dataset Structure
| Split | Images | Masks | Description |
|---------|--------|-------|---------------------------------------|
| train | 499 | 499 | Training samples with pixel masks |
| val | 100 | 100 | Validation samples with pixel masks |
| test | 20 | - | Test samples (no masks provided) |
| **Total** | **619** | **599** | |
### Image Properties
| Property | Value |
|----------------|----------------------------------------|
| Format | GeoTIFF (.tif) |
| Size | 512 × 512 pixels |
| Data type | 8-bit unsigned integer (0-255) |
| Spectral bands | 4 (R, G, B, NIR) |
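As an illustration of these properties, the following minimal sketch turns a channels-first, 4-band, 8-bit tile into a channels-last RGB array scaled to [0, 1], which is the layout most plotting libraries expect. The helper name `to_rgb_composite` is hypothetical, and the tile is synthetic random data standing in for a real image.

```python
import numpy as np


def to_rgb_composite(image: np.ndarray) -> np.ndarray:
    """Convert a (4, H, W) uint8 array to an (H, W, 3) float32 RGB image in [0, 1]."""
    if image.ndim != 3 or image.shape[0] != 4:
        raise ValueError(f"expected shape (4, H, W), got {image.shape}")
    rgb = image[:3]                     # bands 0-2 are R, G, B in this dataset
    rgb = np.transpose(rgb, (1, 2, 0))  # channels-first -> channels-last
    return rgb.astype(np.float32) / 255.0


# Synthetic stand-in for a real tile (assumption: not actual dataset content)
tile = np.random.default_rng(0).integers(0, 256, size=(4, 512, 512), dtype=np.uint8)
rgb = to_rgb_composite(tile)
print(rgb.shape, rgb.dtype)
```

The NIR band (index 3) is dropped here only because a standard display has three channels; it remains available in the original array.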
### Spectral Bands
| Band index | Sentinel-2 band | Central wavelength | Description |
|------------|-----------------|--------------------|-----------------|
| 0 | B4 | 664.5 nm | Red |
| 1 | B3 | 559 nm | Green |
| 2 | B2 | 492.4 nm | Blue |
| 3 | B8 | 832.8 nm | Near-Infrared |
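Because the red and near-infrared bands are both present, a vegetation index such as NDVI can be derived directly from a tile. The sketch below assumes the band order given in the table (index 0 is red, index 3 is NIR); the function name `ndvi` and the small `eps` guard against division by zero are illustrative choices, not part of the dataset.

```python
import numpy as np


def ndvi(image: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Compute NDVI = (NIR - Red) / (NIR + Red) for a (4, H, W) array ordered R, G, B, NIR."""
    # Cast to float first so the uint8 subtraction cannot wrap around
    red = image[0].astype(np.float32)
    nir = image[3].astype(np.float32)
    return (nir - red) / (nir + red + eps)


# Tiny synthetic tile: red = 50 and NIR = 150 everywhere
tile = np.zeros((4, 2, 2), dtype=np.uint8)
tile[0] = 50
tile[3] = 150
print(ndvi(tile))
```

Since the tiles store 8-bit values rather than surface reflectance, the result is best treated as a relative index and may not match NDVI computed from the original Level 2A products.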
### Label Encoding
| Value | Class | Description |
|-------|------------|--------------------------------|
| 0 | Background | Non-forested areas (soil, water, urban) |
| 1 | Forest | Forested areas |
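A quick sanity check on a mask follows from this encoding: every pixel should be 0 or 1, and the mean of the forest pixels gives the forested share of the tile. The helper `forest_fraction` below is a hypothetical name, and the mask is a hand-made example rather than real data.

```python
import numpy as np


def forest_fraction(mask: np.ndarray) -> float:
    """Return the share of pixels labelled Forest (1) in a binary mask."""
    values = np.unique(mask)
    if not np.isin(values, [0, 1]).all():
        raise ValueError(f"unexpected label values: {values.tolist()}")
    return float((mask == 1).mean())


# Hand-made 2 x 2 mask with a single forest pixel
mask = np.array([[0, 0], [0, 1]], dtype=np.uint8)
print(forest_fraction(mask))
```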
## Loading the Dataset
### Using Hugging Face Datasets (Recommended)
```python
from datasets import load_dataset
# Load from HuggingFace Hub
dataset = load_dataset("NickBurns/amazon-sentinel2-forest-segmentation")
# Access splits
train_ds = dataset["train"]
val_ds = dataset["val"]
test_ds = dataset["test"]
# Example: access a single sample
sample = train_ds[0]
image = sample["image"] # shape: (4, 512, 512) - [R, G, B, NIR]
label = sample["label"] # shape: (512, 512) - binary mask
filename = sample["filename"]
```
### Using Rasterio (Manual Loading)
```python
import rasterio
import numpy as np
from pathlib import Path
def load_sample(image_path, label_path=None):
    """Load a single image and optional mask."""
    with rasterio.open(image_path) as src:
        image = src.read()  # all bands, shape: (4, H, W)
    label = None
    if label_path and Path(label_path).exists():
        with rasterio.open(label_path) as src:
            label = src.read(1)  # first band only, shape: (H, W)
    return image, label
```
### Using torchgeo
```python
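from torch.utils.data import DataLoader
from torchgeo.datasets import RasterDataset, stack_samples
from torchgeo.samplers import RandomGeoSampler


# Note: multi-band GeoTIFFs may need additional handling (e.g. band selection)
class Sentinel2Dataset(RasterDataset):
    filename_glob = "*.tif"
    is_image = True


ds = Sentinel2Dataset("path/to/train/image/")

# torchgeo datasets are indexed by bounding box, not by integer, so the
# DataLoader needs a geo-sampler and the stack_samples collate function.
# Assumes the tiles keep their georeferencing; size is in pixels and
# length (samples per epoch) is an arbitrary illustrative value.
sampler = RandomGeoSampler(ds, size=256, length=100)
dl = DataLoader(ds, batch_size=4, sampler=sampler, collate_fn=stack_samples)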
```
## Example Sample
```python
import numpy as np
from datasets import load_dataset

ds = load_dataset("NickBurns/amazon-sentinel2-forest-segmentation", split="train")
sample = ds[0]

# np.asarray covers samples returned as nested lists as well as arrays
image = np.asarray(sample["image"])
label = np.asarray(sample["label"])

print(f"Image shape: {image.shape}")  # (4, 512, 512)
print(f"Label shape: {label.shape}")  # (512, 512)
print(f"Unique labels: {np.unique(label)}")  # [0, 1]
print(f"Filename: {sample['filename']}")
```
## Dataset Statistics
### Class Distribution (Training Set)
According to the original Zenodo publication, the dataset was curated to cover a diverse mix of forest and non-forest areas for training semantic segmentation models.
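The actual class balance can be measured from the masks themselves. The sketch below accumulates pixel counts over any iterable of binary masks and returns the background and forest shares; `class_distribution` is a hypothetical helper, and the three small masks are synthetic examples, not dataset statistics.

```python
from typing import Iterable

import numpy as np


def class_distribution(masks: Iterable[np.ndarray]) -> dict[str, float]:
    """Return the pixel share of each class across an iterable of 0/1 masks."""
    counts = np.zeros(2, dtype=np.int64)
    for mask in masks:
        binned = np.bincount(np.asarray(mask, dtype=np.int64).ravel(), minlength=2)
        if binned.size > 2:
            raise ValueError("mask contains labels other than 0 and 1")
        counts += binned
    total = int(counts.sum())
    if total == 0:
        raise ValueError("no pixels were provided")
    return {
        "background": float(counts[0] / total),
        "forest": float(counts[1] / total),
    }


# Synthetic masks: all background, all forest, and a mixed tile
masks = [
    np.zeros((2, 2), dtype=np.uint8),
    np.ones((2, 2), dtype=np.uint8),
    np.array([[0, 1], [1, 1]], dtype=np.uint8),
]
dist = class_distribution(masks)
print(dist)
```

To run this on the training split, the iterable could be a generator that yields each training mask as a 2-D array, for example by reading the mask files one at a time with rasterio.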
### Geographic Coverage
- **Region**: Amazon Biome, Brazil
- **Satellite**: Sentinel-2A
- **Acquisition**: 2020
## License
[Creative Commons Attribution 4.0 International (CC-BY 4.0)](https://creativecommons.org/licenses/by/4.0/)
## Citation
```bibtex
@misc{bragagnolo2021amazon,
title = {Amazon and Atlantic Forest image datasets for semantic segmentation},
author = {Bragagnolo, Lucimara and da Silva, Roberto Valmir and Grzybowski, José Mario Vicensi},
year = {2021},
publisher = {Zenodo},
doi = {10.5281/zenodo.4498086},
url = {https://doi.org/10.5281/zenodo.4498086}
}
```
## Acknowledgments
The original dataset was created by researchers at the Federal University of Fronteira Sul, Brazil. It was converted and uploaded to Hugging Face for easier access and integration with machine learning workflows.
Provider:
IsuruDiIshan



