KakumaAerial Dataset
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14607338
Description:
Overview
The KakumaAerial dataset accompanies the paper "Mapping Refugee Camps with AI: A Benchmark Dataset and Baseline Models for Humanitarian Applications," submitted to the GeoCV Workshop on Computer Vision for Geospatial Image Analysis at WACV 2025. It features high-resolution aerial imagery paired with annotations for buildings and solar panels in the Kakuma and Kalobeyei refugee camps in Turkana County, Kenya.
The dataset supports key use cases in refugee camp contexts, including:
Mapping building and solar panel footprints, which can help inform spatial planning, population estimation, and energy access.
Classifying roof materials of non-toilet buildings to evaluate shelter quality and durability.
Identifying toilets to assess access to sanitation facilities and public health conditions.
This dataset is designed to advance humanitarian mapping by engaging the geospatial computer vision community with benchmark data tailored to the challenges of refugee camp environments, such as irregular layouts, diverse structures, and varying materials.
Code to support this project is available in the associated GitHub repository: https://github.com/USAFORUNHCRhive/turkana-camp-roof-mapping
Citation
If you use this dataset, please cite the accompanying paper:
A. Gupta, A. Ortiz, S. F. Nsutezo, D. Kebut, S. Iyer, R. Dodhia, and J. M. Lavista Ferres, "Mapping Refugee Camps with AI: A Benchmark Dataset and Baseline Models for Humanitarian Applications," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, Tucson, AZ, USA, Mar. 2025.
BibTeX:
@InProceedings{Gupta_2025_WACV,
author = {Gupta, Amrita and Ortiz, Anthony and Nsutezo, Simone Fobi and Kebut, Duncan and Iyer, Seema and Dodhia, Rahul and Lavista Ferres, Juan M.},
title = {Mapping Refugee Camps with AI: A Benchmark Dataset and Baseline Models for Humanitarian Applications},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
month = {March},
year = {2025},
}
Dataset Contents
The KakumaAerial dataset includes the following components:
Tiled Imagery and Masks
High-resolution aerial imagery (tiled_images.zip): 500m x 500m tiles as 4-channel GeoTIFFs (RGB + NoData mask) derived from processed OpenAerialMap scenes.
Semantic segmentation masks: Single-channel GeoTIFFs derived from processed OpenStreetMap data, provided in three configurations with the following semantic classes:
tiled_masks-background-building-solar.zip: 0 = Unlabeled, 1 = Background, 2 = Building, 3 = Solar Panel
tiled_masks-background-building_boundary-building-solar.zip: 0 = Unlabeled, 1 = Background, 2 = Building Boundary, 3 = Building, 4 = Solar Panel
tiled_masks-background-building-building_boundary-solar.zip: 0 = Unlabeled, 1 = Background, 2 = Building, 3 = Building Boundary, 4 = Solar Panel
Dataset splits file (tiled_dataset_splits.yml): A YAML file specifying train, validation, and test tile assignments.
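To illustrate the class encodings listed above, the following minimal Python sketch records the three ID-to-name tables and remaps a mask from one configuration to another by class name. The dictionary keys and the helper names remap_table and remap_mask are hypothetical, chosen for this example. Masks are represented as plain nested lists rather than GeoTIFF arrays, and reading the files themselves is left out.

```python
# Class IDs as listed for the three mask archives.
# The dictionary keys are shorthand invented for this sketch.
MASK_CLASSES = {
    "background-building-solar": {
        0: "Unlabeled", 1: "Background", 2: "Building", 3: "Solar Panel",
    },
    "background-building_boundary-building-solar": {
        0: "Unlabeled", 1: "Background", 2: "Building Boundary",
        3: "Building", 4: "Solar Panel",
    },
    "background-building-building_boundary-solar": {
        0: "Unlabeled", 1: "Background", 2: "Building",
        3: "Building Boundary", 4: "Solar Panel",
    },
}


def remap_table(src, dst):
    """Build {src_id: dst_id} by matching class names.

    Raises KeyError if a source class has no counterpart in the target
    configuration (e.g. "Building Boundary" in the 4-class scheme).
    """
    name_to_dst = {name: class_id for class_id, name in MASK_CLASSES[dst].items()}
    return {class_id: name_to_dst[name] for class_id, name in MASK_CLASSES[src].items()}


def remap_mask(mask, table):
    """Apply an ID lookup table to a mask given as a list of rows."""
    return [[table[value] for value in row] for row in mask]


table = remap_table(
    "background-building_boundary-building-solar",
    "background-building-building_boundary-solar",
)
converted = remap_mask([[0, 2], [3, 4]], table)
```

The two five-class configurations differ only in whether Building or Building Boundary takes ID 2, so the lookup table swaps IDs 2 and 3 and leaves the rest unchanged. Matching by name rather than hard-coding the swap makes a missing class fail loudly instead of being silently mislabeled.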
Pre-Chipped Datasets
Semantic segmentation chips: 256x256 pixel chips derived from the tiled imagery and the three mask configurations described above:
chipped_segmentation-background-building-solar.zip
chipped_segmentation-background-building_boundary-building-solar.zip
chipped_segmentation-background-building-building_boundary-solar.zip
Roof material classification chips (chipped_roof_material_classification.zip): RGB chips with a fourth binary channel containing the footprint of a specific building, for which the roof material (metal, thatch, plastic, or other) is labeled.
Toilet classification chips (chipped_toilet_classification.zip): RGB chips with a fourth binary channel containing the footprint of a specific building, for which the binary label indicates whether the building is a toilet.
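The classification chips pair RGB imagery with a footprint channel for the building being labeled. The sketch below shows one way such a chip could be separated and summarized in plain Python. It assumes a channel-first layout (R, G, B, footprint) and that nonzero footprint values mark the building; both are assumptions for illustration and should be checked against the actual files. The function names are hypothetical, and how labels are stored is not covered here.

```python
# Roof material names from the dataset description; this tuple's order is
# arbitrary and does NOT reflect any label encoding used in the archive.
ROOF_MATERIALS = ("metal", "thatch", "plastic", "other")


def split_chip(chip):
    """Split a channel-first 4-channel chip into (rgb, footprint).

    Assumption: chip[0:3] are R, G, B and chip[3] is the binary footprint.
    """
    if len(chip) != 4:
        raise ValueError(f"expected 4 channels, got {len(chip)}")
    return chip[:3], chip[3]


def footprint_fraction(footprint):
    """Share of chip pixels that fall inside the building footprint."""
    total = sum(len(row) for row in footprint)
    inside = sum(1 for row in footprint for value in row if value)
    return inside / total


def mean_rgb_in_footprint(rgb, footprint):
    """Mean of each colour band over footprint pixels only."""
    coords = [
        (r, c)
        for r, row in enumerate(footprint)
        for c, value in enumerate(row)
        if value
    ]
    if not coords:
        raise ValueError("footprint channel is empty")
    return tuple(sum(band[r][c] for r, c in coords) / len(coords) for band in rgb)


# Tiny 2x2 stand-in for a real chip.
chip = [
    [[10, 20], [30, 40]],  # R
    [[1, 2], [3, 4]],      # G
    [[0, 0], [0, 0]],      # B
    [[1, 0], [0, 1]],      # footprint
]
rgb, footprint = split_chip(chip)
fraction = footprint_fraction(footprint)
mean_colour = mean_rgb_in_footprint(rgb, footprint)
```

Keeping the footprint as a separate channel lets a classifier see the surrounding context while still knowing which building the label refers to. A summary such as the footprint fraction can help spot chips where the target building occupies only a few pixels.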
Vector Files (Optional Components)
Manually corrected OSM-derived annotations (osm-20241001-transformed-buildings_toilets_solar.geojson): Labeled polygons for buildings and solar panels, derived from OpenStreetMap data, with manual adjustments for improved alignment with the imagery and additional geometry transformations (e.g., converting points to polygons for toilets).
Manually drawn background polygons (manual-background-polygons.geojson): Polygons representing non-building areas.
Background points for hard negatives (hard-negative-background-points.shp): Points used to sample chips for creating hard negative examples for semantic segmentation.
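As a first check on the GeoJSON vector files, the following sketch counts features by geometry type using only the Python standard library. It assumes each file is a standard GeoJSON FeatureCollection. The attribute (property) names of the features are not documented in this description, so the example deliberately avoids them. The helper names are hypothetical.

```python
import json
from collections import Counter


def load_geojson(path):
    """Read a GeoJSON file into a Python dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_geometry_types(feature_collection):
    """Count features per geometry type (e.g. Polygon, MultiPolygon).

    Features with a null geometry are counted under the key "None".
    """
    counts = Counter()
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        counts[geometry.get("type", "None")] += 1
    return dict(counts)


# In-memory stand-in for a real file, used here so the sketch runs as-is.
example = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": []}, "properties": {}},
        {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": []}, "properties": {}},
        {"type": "Feature", "geometry": None, "properties": {}},
    ],
}
summary = count_geometry_types(example)
```

With the archive downloaded, a call such as count_geometry_types(load_geojson("manual-background-polygons.geojson")) would give a quick view of what the file contains. The shapefile of hard-negative points is a different format and would need a dedicated reader.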
Created:
2025-01-23



