Generating aerial flood prediction imagery

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/13366121

Description:

Flood Generation Dataset

Dataset Overview

The dataset contains paired sets of pre- and post-flooding satellite images, which can be used for training models on flood prediction tasks, particularly on generating photorealistic visualisations of floods. Each image is also associated with topographical factors, which can be used to assist the model in making predictions. These factors are a digital elevation model (DEM), flow accumulation, distance to rivers, and a cartographical map. Every image set was manually verified and adjusted to ensure perfect pixel alignment, so that each pixel represents the same 0.5 m x 0.5 m geographical area in all of the factors.

The dataset contains 1229 unique image sets: 235 of Hurricane Florence, 335 of Hurricane Harvey, 147 of the Midwest floods, and 512 of the India monsoon. 408 of the image sets (from Hurricane Harvey and some of the Midwest floods) have an additional alternative version with a 1 m/pixel resolution DEM.

Dataset Structure

The dataset input folder contains images with the pre-flooding satellite image and topographical factors represented in 9 stacked channels (see the table below) and stored as a .tif file. The dataset output folder contains the post-flooding satellite image, stored as a .tif file. The file name indicates which disaster the image represents.

Channel    Contents
0, 1, 2    Pre-flooding satellite image
3          DEM
4          Flow accumulation
5          Distance to rivers
6, 7, 8    Cartographical map

Dataset Contents

Pre- and post-flooding satellite images

Each image has 1024x1024 pixels in a three-band RGB format, with a resolution of approximately 0.5 metres/pixel. The satellite images were extracted from the xBD dataset, captured by the Maxar Open Data Program. xBD was released under the Creative Commons Attribution-Noncommercial-Sharealike 4.0 International licence (CC BY-NC-SA 4.0).
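As a rough illustration of the channel layout described above, the sketch below splits a 9-channel input array into named components. It assumes the .tif has already been read into a NumPy array with the channels on the last axis, i.e. shape (H, W, 9). The reading step, the axis order, and the names CHANNEL_LAYOUT and split_input_channels are assumptions for illustration and are not part of the dataset's documentation.

```python
import numpy as np

# Channel indices as listed in the dataset description.
# Multi-channel entries use slices; single-channel entries use an integer index.
CHANNEL_LAYOUT = {
    "pre_flood_rgb": slice(0, 3),
    "dem": 3,
    "flow_accumulation": 4,
    "distance_to_rivers": 5,
    "map_rgb": slice(6, 9),
}


def split_input_channels(stack: np.ndarray) -> dict:
    """Split an (H, W, 9) array into named components.

    Assumes channels-last ordering, which should be checked against the
    actual files before use.
    """
    if stack.ndim != 3 or stack.shape[-1] != 9:
        raise ValueError(f"expected shape (H, W, 9), got {stack.shape}")
    return {name: stack[..., idx] for name, idx in CHANNEL_LAYOUT.items()}


# Usage with a synthetic stand-in for one input image.
stack = np.zeros((1024, 1024, 9), dtype=np.uint8)
parts = split_input_channels(stack)
```

If the files turn out to be stored channels-first, moving the first axis to the end with np.moveaxis(stack, 0, -1) before calling the function would be one way to adapt the sketch.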

Digital elevation model and flow accumulation

The USGS 3D Elevation Program DEM, which has a 1/3 arc-second (10 metre) resolution, describes the topography of the image areas within the USA. The elevations in this DEM represent the topographic bare-earth surface. The Copernicus GLO-30 DEM, which has a resolution of only 30 metres, was used for the flood areas in India. The flow accumulation was calculated from the DEMs.

The USGS 3D Elevation Program DEM is cited as: U.S. Geological Survey, 2023, 1/3rd arc-second Digital Elevation Models (DEMs) - USGS National Map 3DEP Downloadable Data Collection. All 3DEP products are public domain. The Copernicus Global Digital Elevation Model was produced using Copernicus WorldDEM-30 © DLR e.V. 2010-2014 and © Airbus Defence and Space GmbH 2014-2018, provided under COPERNICUS by the European Union and ESA.

Distance to rivers and cartographical map

OpenStreetMap data was downloaded from Planet OSM and processed using the Osmium tool. The Maperitive software was then used to apply a custom ruleset to the maps' appearance, removing all of the text and enhancing the clarity of the land use types. The distance to rivers representation was produced by creating buffer distances (at 0.5 km intervals) to all major rivers and waterways, as classified by OpenStreetMap. The OpenStreetMap (OSM) data is distributed under the Open Database License (ODbL). https://www.openstreetmap.org/copyright

Flood Segmentation Dataset

Dataset Overview

To evaluate the flood predictive accuracy of post-flooding images generated by a model trained on the flood generation dataset, the synthetic and ground truth post-flooding images can be compared using a flood segmentation model trained on this dataset.
A post-flooding satellite image is input into the segmentation model, which classifies whether each pixel is flooded and outputs a 1-channel binary flood mask, where white (1) corresponds to a predicted flooded pixel and black (0) corresponds to a predicted non-flooded pixel. The flood masks corresponding to the real and synthetic images can then be compared using metrics such as MSE, precision, and recall, thus determining whether the generative model flooded the same areas that are actually flooded in the real ground truth image.

Dataset Structure

The flood segmentation dataset contains 1028 images, depicting the same disasters as the flood generation dataset. The masks input folder contains 256x256 pixel 3-channel RGB post-flooding .tif images, which comprise a mix of ground truth images and synthetic images generated by different GAN architectures. The masks output folder contains the corresponding 1-channel binary flood masks. 332 of the flood masks were published openly at https://huggingface.co/datasets/blutjens/eie-earth-intelligence-engine. The rest of the masks were manually labelled by the dataset author.
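The mask comparison described above can be sketched roughly as follows. This is a minimal illustration rather than the dataset author's evaluation code. It assumes both masks are already available as arrays of 0s and 1s with the same shape, treats the mask from the real image as the reference, and returns 0.0 for precision or recall when the denominator is zero. The function name compare_flood_masks and that zero-division convention are assumptions.

```python
import numpy as np


def compare_flood_masks(real_mask, synthetic_mask) -> dict:
    """Compare two binary flood masks (1 = flooded, 0 = not flooded).

    real_mask is the mask predicted from the ground truth image and is
    used as the reference; synthetic_mask is the mask predicted from
    the generated image.
    """
    real = np.asarray(real_mask).astype(bool)
    pred = np.asarray(synthetic_mask).astype(bool)
    if real.shape != pred.shape:
        raise ValueError(f"shape mismatch: {real.shape} vs {pred.shape}")

    tp = np.count_nonzero(real & pred)    # flooded in both masks
    fp = np.count_nonzero(~real & pred)   # flooded only in the synthetic mask
    fn = np.count_nonzero(real & ~pred)   # flooded only in the real mask

    # For binary masks, MSE equals the fraction of disagreeing pixels.
    mse = float(np.mean((real.astype(float) - pred.astype(float)) ** 2))
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # assumed convention
    return {"mse": mse, "precision": precision, "recall": recall}


# Usage on a tiny 2x2 example.
real = [[1, 1], [0, 0]]
synthetic = [[1, 0], [1, 0]]
scores = compare_flood_masks(real, synthetic)
```

In this toy case one pixel is flooded in both masks, one only in the synthetic mask, and one only in the real mask, so all three scores come out at 0.5.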

Created:

2024-08-27
