FireSR: A Dataset for Super-Resolution and Segmentation of Burned Areas
Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/11383985
Resource description:
# FireSR Dataset
## Overview
**FireSR** is a dataset designed for the super-resolution and segmentation of wildfire-burned areas. It includes data for all wildfire events in Canada from 2017 to 2023 that exceed 2000 hectares in size, as reported by the National Burned Area Composite (NBAC). The dataset aims to support high-resolution daily monitoring and improve wildfire management using machine learning techniques.
## Dataset Structure
The dataset is organized into several directories, each containing data relevant to different aspects of wildfire monitoring:
- **S2**: Contains Sentinel-2 images.
  - **pre**: Pre-fire Sentinel-2 images (high resolution).
  - **post**: Post-fire Sentinel-2 images (high resolution).
- **mask**: Contains NBAC polygons, which serve as ground truth masks for the burned areas.
  - **pre**: Burned area labels from the year before the fire, using the same spatial bounds as the fire events of the current year.
  - **post**: Burned area labels corresponding to post-fire conditions.
- **MODIS**: Contains post-fire MODIS images (lower resolution).
- **LULC**: Contains land use/land cover data from ESRI Sentinel-2 10-Meter Land Use/Land Cover (2017-2023).
- **Daymet**: Contains weather data from Daymet V4: Daily Surface Weather and Climatological Summaries.
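
Because every directory uses the same file name for a given fire event, the layers for one event can be collected by matching file stems. The following is a minimal sketch, not part of the official tooling: the helper names `complete_events` and `event_paths` are hypothetical, and it assumes (the README does not state this) that an event may be missing from some directories, so only events present in all seven are returned.

```python
from pathlib import Path

# Sub-directories as listed above, relative to FireSR/dataset.
LAYERS = ["S2/pre", "S2/post", "mask/pre", "mask/post", "MODIS", "LULC", "Daymet"]


def complete_events(dataset_root):
    """Return sorted event names (file stems) that have a .tif in every layer."""
    root = Path(dataset_root)
    common = None
    for layer in LAYERS:
        stems = {p.stem for p in (root / layer).glob("*.tif")}
        common = stems if common is None else common & stems
    return sorted(common or [])


def event_paths(dataset_root, event):
    """Map each layer name to the path of that event's GeoTIFF."""
    root = Path(dataset_root)
    return {layer: root / layer / f"{event}.tif" for layer in LAYERS}
```

Calling `complete_events("FireSR/dataset")` after extraction would give the list of events for which a full set of inputs and labels is available.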
### File Naming Convention
Each GeoTIFF (.tif) file is named according to the format `CA_<year>_<province>_<id>.tif`, where:
- `CA` stands for Canada.
- `<year>` is the year of the wildfire event.
- `<province>` is the province code (e.g., AB for Alberta, BC for British Columbia).
- `<id>` is a unique identifier for the wildfire event.
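
The naming convention can be parsed with a regular expression, for example to group events by year or province. This is a sketch under assumptions: the `FireEvent` type and `parse_event_filename` function are hypothetical names, and the pattern assumes a four-digit year, a two-letter province code, and a numeric identifier, as in the example file names `CA_2017_AB_204.tif` and `CA_2017_AB_2418.tif`.

```python
import re
from typing import NamedTuple


class FireEvent(NamedTuple):
    country: str
    year: int
    province: str
    event_id: str


# Assumed pattern: CA_<4-digit year>_<2-letter province>_<numeric id>.tif
_NAME_RE = re.compile(r"^(CA)_(\d{4})_([A-Z]{2})_(\d+)\.tif$")


def parse_event_filename(filename):
    """Split a FireSR file name into its components."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name: {filename!r}")
    country, year, province, event_id = match.groups()
    return FireEvent(country, int(year), province, event_id)
```

File names that carry extra suffixes (such as tiles with row and column offsets appended) will not match this pattern and raise `ValueError`.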
### Directory Structure
The dataset is organized as follows:
```
FireSR/
│
├── dataset/
│   ├── S2/
│   │   ├── post/
│   │   │   ├── CA_2017_AB_204.tif
│   │   │   ├── CA_2017_AB_2418.tif
│   │   │   └── ...
│   │   ├── pre/
│   │   │   ├── CA_2017_AB_204.tif
│   │   │   ├── CA_2017_AB_2418.tif
│   │   │   └── ...
│   ├── mask/
│   │   ├── post/
│   │   │   ├── CA_2017_AB_204.tif
│   │   │   ├── CA_2017_AB_2418.tif
│   │   │   └── ...
│   │   ├── pre/
│   │   │   ├── CA_2017_AB_204.tif
│   │   │   ├── CA_2017_AB_2418.tif
│   │   │   └── ...
│   ├── MODIS/
│   │   ├── CA_2017_AB_204.tif
│   │   ├── CA_2017_AB_2418.tif
│   │   └── ...
│   ├── LULC/
│   │   ├── CA_2017_AB_204.tif
│   │   ├── CA_2017_AB_2418.tif
│   │   └── ...
│   ├── Daymet/
│   │   ├── CA_2017_AB_204.tif
│   │   ├── CA_2017_AB_2418.tif
│   │   └── ...
```
### Spatial Resolution and Channels
- **Sentinel-2 (S2) Images**: 20 meters (Bands: B12, B8, B4)
- **MODIS Images**: 250 meters (Bands: B7, B2, B1)
- **NBAC Burned Area Labels**: 20 meters (1 channel, binary classification: burned/unburned)
- **Daymet Weather Data**: 1000 meters (7 channels: dayl, prcp, srad, swe, tmax, tmin, vp)
- **ESRI Land Use/Land Cover Data**: 10 meters (1 channel with 9 classes: water, trees, flooded vegetation, crops, built area, bare ground, snow/ice, clouds, rangeland)
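
The resolutions listed above determine how much each layer must be resampled to line up with the 20-meter Sentinel-2 grid. The sketch below works through that arithmetic. It assumes the files are stored at the nominal pixel sizes listed here (worth checking against each GeoTIFF's metadata), and the names `RESOLUTION_M`, `scale_factor`, and `pixels_across` are illustrative.

```python
# Nominal pixel sizes in meters, taken from the list above.
RESOLUTION_M = {"S2": 20, "NBAC": 20, "MODIS": 250, "Daymet": 1000, "LULC": 10}


def scale_factor(source, target):
    """Linear factor by which `source` pixels are larger than `target` pixels."""
    return RESOLUTION_M[source] / RESOLUTION_M[target]


def pixels_across(layer, extent_m):
    """Number of `layer` pixels spanning a ground distance of `extent_m` meters."""
    return extent_m / RESOLUTION_M[layer]


# MODIS to Sentinel-2 is the super-resolution gap: 250 / 20 = 12.5x per side.
print(scale_factor("MODIS", "S2"))
# A 128-pixel Sentinel-2 tile spans 2560 m, i.e. about 10 MODIS pixels.
print(pixels_across("MODIS", 128 * RESOLUTION_M["S2"]))
```

The same helper shows that Daymet is 50 times coarser than Sentinel-2 per side, while LULC is twice as fine and would need downsampling rather than upsampling.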
**Daymet Weather Data**: The Daymet dataset includes seven channels that provide various weather-related parameters, which are crucial for understanding and modeling wildfire conditions:
| Name | Units | Min | Max | Description |
|------|-------|-----|-----|-------------|
| dayl | seconds | 0 | 86400 | Duration of the daylight period, based on the period of the day during which the sun is above a hypothetical flat horizon. |
| prcp | mm | 0 | 544 | Daily total precipitation, sum of all forms converted to water-equivalent. |
| srad | W/m^2 | 0 | 1051 | Incident shortwave radiation flux density, averaged over the daylight period of the day. |
| swe | kg/m^2 | 0 | 13931 | Snow water equivalent, representing the amount of water contained within the snowpack. |
| tmax | °C | -60 | 60 | Daily maximum 2-meter air temperature. |
| tmin | °C | -60 | 42 | Daily minimum 2-meter air temperature. |
| vp | Pa | 0 | 8230 | Daily average partial pressure of water vapor. |
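
Since the seven channels have very different units and ranges, it can help to rescale them before feeding them to a model. The following sketch applies per-channel min-max scaling using the Min and Max columns of the table. It assumes the bands in each Daymet GeoTIFF are stored in the order listed (dayl, prcp, srad, swe, tmax, tmin, vp), which should be verified against the files; `normalize_daymet` is a hypothetical helper name.

```python
import numpy as np

# (min, max) per channel, copied from the table above.
DAYMET_RANGES = {
    "dayl": (0, 86400),
    "prcp": (0, 544),
    "srad": (0, 1051),
    "swe": (0, 13931),
    "tmax": (-60, 60),
    "tmin": (-60, 42),
    "vp": (0, 8230),
}
# Assumed band order inside each Daymet file.
DAYMET_ORDER = ["dayl", "prcp", "srad", "swe", "tmax", "tmin", "vp"]


def normalize_daymet(array):
    """Scale a (7, H, W) Daymet array to the range [0, 1] channel by channel."""
    array = np.asarray(array, dtype=np.float32)
    if array.shape[0] != len(DAYMET_ORDER):
        raise ValueError(f"Expected {len(DAYMET_ORDER)} channels, got {array.shape[0]}")
    lo = np.array([DAYMET_RANGES[n][0] for n in DAYMET_ORDER], dtype=np.float32)
    hi = np.array([DAYMET_RANGES[n][1] for n in DAYMET_ORDER], dtype=np.float32)
    # Reshape to (7, 1, 1) so the ranges broadcast over height and width.
    lo, hi = lo[:, None, None], hi[:, None, None]
    # Clip in case a file contains values outside the documented range.
    return np.clip((array - lo) / (hi - lo), 0.0, 1.0)
```

Min-max scaling is only one option; standardizing with statistics computed on the training split is a common alternative.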
**ESRI Land Use/Land Cover Data**: The ESRI 10m Annual Land Cover dataset provides a time series of global maps of land use and land cover (LULC) from 2017 to 2023 at a 10-meter resolution. These maps are derived from ESA Sentinel-2 imagery and are generated by Impact Observatory using a deep learning model trained on billions of human-labeled pixels. Each map is a composite of LULC predictions for 9 classes throughout the year, offering a representative snapshot of each year.
| Class Value | Land Cover Class |
|-------------|------------------|
| 1 | Water |
| 2 | Trees |
| 4 | Flooded Vegetation |
| 5 | Crops |
| 7 | Built Area |
| 8 | Bare Ground |
| 9 | Snow/Ice |
| 10 | Clouds |
| 11 | Rangeland |
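
Note that the class values are not contiguous (3 and 6 are unused), whereas most segmentation losses and one-hot encodings expect labels in the range 0 to N-1. One possible way to handle this is a lookup table, sketched below. The function names and the choice of 255 as the index for unlisted values are assumptions made for illustration.

```python
import numpy as np

# Class values and names from the table above.
LULC_CLASSES = {
    1: "water",
    2: "trees",
    4: "flooded vegetation",
    5: "crops",
    7: "built area",
    8: "bare ground",
    9: "snow/ice",
    10: "clouds",
    11: "rangeland",
}


def build_lookup(ignore_index=255):
    """Build a 256-entry table mapping raw class values to indices 0..8."""
    lut = np.full(256, ignore_index, dtype=np.uint8)
    for index, value in enumerate(sorted(LULC_CLASSES)):
        lut[value] = index
    return lut


def remap_lulc(raster, ignore_index=255):
    """Remap a LULC raster to contiguous class indices.

    Values not listed in the table (for example 0) map to `ignore_index`.
    Assumes the raster holds integer values between 0 and 255.
    """
    raster = np.asarray(raster).astype(np.uint8)
    return build_lookup(ignore_index)[raster]
```

With this mapping, water becomes 0, trees 1, and rangeland 8.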
## Usage Tutorial
To help users get started with FireSR, we provide a comprehensive tutorial with scripts for data extraction and processing. Below is an example workflow:
### Step 1: Extract FireSR.tar.gz
```bash
tar -xvf FireSR.tar.gz
```
### Step 2: Tiling the GeoTIFF Files
The dataset contains high-resolution GeoTIFF files. For machine learning models, it may be useful to tile these images into smaller patches. Here's a Python script to tile the images:
```python
import os

import rasterio
from rasterio.windows import Window


def tile_image(image_path, output_dir, tile_size=128):
    """Split a GeoTIFF into tile_size x tile_size GeoTIFF patches."""
    os.makedirs(output_dir, exist_ok=True)
    stem = os.path.splitext(os.path.basename(image_path))[0]
    with rasterio.open(image_path) as src:
        for i in range(0, src.height, tile_size):
            for j in range(0, src.width, tile_size):
                window = Window(j, i, tile_size, tile_size)
                transform = src.window_transform(window)
                outpath = os.path.join(output_dir, f"{stem}_{i}_{j}.tif")
                with rasterio.open(outpath, 'w', driver='GTiff', height=tile_size, width=tile_size, count=src.count, dtype=src.dtypes[0], crs=src.crs, transform=transform) as dst:
                    # Edge windows may extend past the raster; a boundless read
                    # pads them with zeros so every tile has the same shape.
                    dst.write(src.read(window=window, boundless=True, fill_value=0))


# Example usage
tile_image('FireSR/dataset/S2/post/CA_2017_AB_204.tif', 'tiled_images/')
```
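
The tile grid that such a script walks over can be computed without opening any raster, which is handy for estimating how many patches an image will produce or for checking edge handling. The sketch below is illustrative only: `tile_windows` is a hypothetical helper, and it clips edge tiles to the raster bounds rather than padding them, which is one of several reasonable choices.

```python
def tile_windows(height, width, tile_size=128):
    """Yield (row_off, col_off, tile_height, tile_width) for each tile.

    Tiles on the bottom and right edges are clipped so that they never
    extend past the raster.
    """
    for row in range(0, height, tile_size):
        for col in range(0, width, tile_size):
            yield (
                row,
                col,
                min(tile_size, height - row),
                min(tile_size, width - col),
            )


# A 300 x 200 raster yields a 3 x 2 grid; the last tile is 44 rows by 72 columns.
windows = list(tile_windows(300, 200))
print(len(windows), windows[-1])
```

Clipped tiles have varying shapes, so they would need padding or filtering before being batched by a data loader.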
### Step 3: Loading Data into a Machine Learning Model
After tiling, the images can be loaded into a machine learning model using libraries like PyTorch or TensorFlow. Here's an example using PyTorch:
```python
import os

import rasterio
import torch
from torch.utils.data import Dataset, DataLoader


class FireSRDataset(Dataset):
    """Loads tiled GeoTIFF patches as float tensors of shape (C, H, W)."""

    def __init__(self, image_dir, transform=None):
        self.image_dir = image_dir
        self.transform = transform
        self.image_paths = sorted(
            os.path.join(image_dir, f)
            for f in os.listdir(image_dir)
            if f.endswith('.tif')
        )

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        with rasterio.open(self.image_paths[idx]) as src:
            # rasterio returns (bands, height, width), which already matches
            # PyTorch's channel-first layout. torchvision's ToTensor() expects
            # (height, width, channels) and would permute the axes incorrectly.
            image = torch.from_numpy(src.read().astype('float32'))
        if self.transform:
            image = self.transform(image)
        return image


# Example usage
dataset = FireSRDataset('tiled_images/')
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)
```
## License
This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to share and adapt the material as long as appropriate credit is given.
## Contact
For any questions or further information, please contact:
- Name: Eric Brune
- Email: ebrune@kth.se
Created: 2024-08-29



