
electricsheepafrica/africa-demographics-botswana

Hugging Face · Updated 2026-04-27 · Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-demographics-botswana
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-sa-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- affected-population
- demographics
- flooding
- hazards-and-risk
- health-facilities
- indicators
- bwa
pretty_name: "Botswana - Risk Assessment Indicators"
dataset_info:
  splits:
  - name: train
    num_examples: 22
  - name: test
    num_examples: 5
---

# Botswana - Risk Assessment Indicators

**Publisher:** HeiGIT (Heidelberg Institute for Geoinformation Technology) · **Source:** [HDX](https://data.humdata.org/dataset/botswana---risk-assessment-indicators) · **License:** `cc-by-sa` · **Updated:** 2026-04-13

---

## Abstract

This dataset provides comprehensive **Risk Assessment Indicators** for **Botswana**, aggregated at **admin level 2**. It is particularly suited to structured risk assessments for **flood** hazards. It includes demographic, environmental, infrastructure, accessibility, and hazard-related data to support disaster risk and resilience analysis. All layers are derived from [HeiGIT’s GAIA Pipeline](https://giscience.github.io/gis-training-resource-center/content/GIS_AA/en_gaia_indicators_processing.html), which integrates open data sources such as [WorldPop](https://www.worldpop.org/), [OpenStreetMap](https://www.openstreetmap.org/), and [Google Earth Engine](https://earthengine.google.com/), using [HDX COD-AB](https://data.humdata.org/dataset/?q=cod-ab) boundaries.


---

### **Data Overview**

- **Access to Services (`BWA_ADM2_access`)**
- **Facilities (`BWA_ADM2_facilities`)**
- **Coping Capacity (`BWA_ADM2_coping`)**
- **Demographics (`BWA_ADM2_demographics`)**
- **Rural Population (`BWA_ADM2_rural_population`)**
- **Vulnerability (`BWA_ADM2_vulnerability`)**
- **Flood Exposure (`BWA_ADM2_flood_exposure`)**

---

### **Indicator Descriptions**

#### **Access to Services (`BWA_ADM2_access`)**

Represents the share of the population with access to key facilities within defined distances or travel times.

- **ADM2_PCODE** – Administrative division code (ADM2)
- **access_pop_education_5km / 10km / 20km** – Population within 5, 10, and 20 km of educational facilities
- **access_pop_hospitals_30min / 1h / 2h** – Population within 30 minutes, 1 hour, and 2 hours of a hospital
- **access_pop_primary_healthcare_30min / 1h / 2h** – Population within 30 minutes, 1 hour, and 2 hours of a primary health care facility

Data Source: [openrouteservice (ORS)](https://openrouteservice.org/)

---

#### **Facilities (`BWA_ADM2_facilities`)**

Counts of essential service facilities within each district.

- **ADM2_PCODE** – Administrative division code (ADM2)
- **education_count** – Number of educational facilities
- **hospitals_count** – Number of hospitals
- **primary_healthcare_count** – Number of primary health care facilities

Data Source: [OpenStreetMap (OSM)](https://www.openstreetmap.org)

---

#### **Coping Capacity (`BWA_ADM2_coping`)**

Combines **Access to Services** and **Facilities** data to represent a district’s coping capacity.

---

#### **Demographics (`BWA_ADM2_demographics`)**

Shows the population composition by age and gender.

- **ADM2_PCODE** – Administrative division code (ADM2)
- **female_pop** – Total female population
- **children_u5** – Population under 5 years old
- **female_u5** – Female population under 5 years old
- **elderly** – Population aged 65 and older
- **pop_u15** – Population under 15 years old
- **female_u15** – Female population under 15 years old

Data Source: [WorldPop](https://www.worldpop.org/)

---

#### **Rural Population (`BWA_ADM2_rural_population`)**

Same demographic breakdown as above, but limited to rural populations. Rural areas are those outside urban extents, typically characterized by lower population density, agricultural or natural land use, and limited infrastructure compared to urban centers.

- **ADM2_PCODE** – Administrative division code (ADM2)
- **female_pop_rural**, **children_u5_rural**, **female_u5_rural**, **elderly_rural**, **pop_u15_rural**, **female_u15_rural** – Rural demographic counts
- **rural_pop_perc** – Percentage of total population living in rural areas

Data Source: [Global Human Settlement Layer (GHSL)](https://human-settlement.emergency.copernicus.eu/datasets.php)

---

#### **Vulnerability (`BWA_ADM2_vulnerability`)**

Combines **Demographics** and **Rural Population** indicators.

---

#### **Flood Exposure (`BWA_ADM2_flood_exposure`)**

Shows population and facility exposure to flooding at 30 cm depth for multiple return periods.

- **ADM2_PCODE** – Administrative division code (ADM2)
- **female_pop_30cm**, **children_u5_30cm**, **female_u5_30cm**, **elderly_30cm**, **pop_u15_30cm**, **female_u15_30cm** – Exposed population by group
- **education_30cm_pct / count**, **hospitals_30cm_pct / count**, **primary_healthcare_30cm_pct / count** – Facility exposure (percentage and count)

Data Source: [The Joint Research Centre (JRC)](https://data.jrc.ec.europa.eu/collection/id-0054)

---

### **QGIS Plugin Risk Assessment Inputs**

- **Coping Capacity** = Access + Facilities
- **Vulnerability** = Demographics + Rural Population
- **Exposure** = Vulnerable Population + Facilities exposed to Floods

This dataset is part of HeiGIT’s **Risk Assessment Indicator Collection** on HDX. See more at [HeiGIT on HDX](https://data.humdata.org/organization/heidelberg-institute-for-geoinformation-technology) and learn about HeiGIT’s research at [HeiGIT](https://heigit.org/). We are happy to hear about your use cases: contact us at [communications@heigit.org](mailto:communications@heigit.org)!

Each row is one tabular record, keyed by `adm2_pcode`. Data was last updated on HDX on 2026-04-13. Geographic scope: **BWA**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Tabular records |
| **Rows (total)** | 28 |
| **Columns** | 16 (12 numeric, 4 categorical, 0 datetime) |
| **Train split** | 22 rows |
| **Test split** | 5 rows |
| **Geographic scope** | BWA |
| **Publisher** | HeiGIT (Heidelberg Institute for Geoinformation Technology) |
| **HDX last updated** | 2026-04-13 |

---

## Variables

**Geographic**: `access_pop_primary_healthcare_30min` (range 0.0–316769.0), `access_pop_primary_healthcare_1h` (range 0.0–370952.0), `access_pop_primary_healthcare_2h` (range 0.0–430441.0), `primary_healthcare_count` (range 0.0–19.0).

**Demographic**: `access_pop_education_5km` (range 0.0–207652.0), `access_pop_education_10km` (range 0.0–299581.0), `access_pop_education_20km` (range 0.0–375153.0), `access_pop_hospitals_30min` (range 0.0–307541.0), `access_pop_hospitals_1h` (range 0.0–369679.0), `access_pop_hospitals_2h` (range 0.0–423111.0).

**Outcome / Measurement**: `education_count` (range 0.0–65.0), `hospitals_count` (range 0.0–8.0).

**Identifier / Metadata**: `adm2_pcode` (BW0101, BW0201, BW1701), `adm_pcode` (BW0101, BW0201, BW1701), `esa_source` (HDX), `esa_processed` (2026-04-27).

---

## Quick Start

```python
from datasets import load_dataset

# Load both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-demographics-botswana")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `adm2_pcode` | object | 0.0% | BW0101, BW0201, BW1701 |
| `access_pop_education_5km` | int64 | 0.0% | 0.0 – 207652.0 (mean 57413.8929) |
| `access_pop_education_10km` | int64 | 0.0% | 0.0 – 299581.0 (mean 66284.5357) |
| `access_pop_education_20km` | int64 | 0.0% | 0.0 – 375153.0 (mean 77781.9643) |
| `access_pop_hospitals_30min` | int64 | 0.0% | 0.0 – 307541.0 (mean 61899.5357) |
| `access_pop_hospitals_1h` | int64 | 0.0% | 0.0 – 369679.0 (mean 77866.75) |
| `access_pop_hospitals_2h` | int64 | 0.0% | 0.0 – 423111.0 (mean 89251.3929) |
| `access_pop_primary_healthcare_30min` | int64 | 0.0% | 0.0 – 316769.0 (mean 62616.3214) |
| `access_pop_primary_healthcare_1h` | int64 | 0.0% | 0.0 – 370952.0 (mean 77571.1071) |
| `access_pop_primary_healthcare_2h` | int64 | 0.0% | 0.0 – 430441.0 (mean 89437.6071) |
| `education_count` | int64 | 0.0% | 0.0 – 65.0 (mean 18.9286) |
| `hospitals_count` | int64 | 0.0% | 0.0 – 8.0 (mean 2.25) |
| `primary_healthcare_count` | int64 | 0.0% | 0.0 – 19.0 (mean 4.8214) |
| `adm_pcode` | object | 0.0% | BW0101, BW0201, BW1701 |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-27 |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `access_pop_education_5km` | 0.0 | 207652.0 | 57413.8929 | 41402.0 |
| `access_pop_education_10km` | 0.0 | 299581.0 | 66284.5357 | 49796.0 |
| `access_pop_education_20km` | 0.0 | 375153.0 | 77781.9643 | 57905.0 |
| `access_pop_hospitals_30min` | 0.0 | 307541.0 | 61899.5357 | 43500.5 |
| `access_pop_hospitals_1h` | 0.0 | 369679.0 | 77866.75 | 57407.0 |
| `access_pop_hospitals_2h` | 0.0 | 423111.0 | 89251.3929 | 69382.0 |
| `access_pop_primary_healthcare_30min` | 0.0 | 316769.0 | 62616.3214 | 36101.0 |
| `access_pop_primary_healthcare_1h` | 0.0 | 370952.0 | 77571.1071 | 57354.0 |
| `access_pop_primary_healthcare_2h` | 0.0 | 430441.0 | 89437.6071 | 66069.0 |
| `education_count` | 0.0 | 65.0 | 18.9286 | 16.0 |
| `hospitals_count` | 0.0 | 8.0 | 2.25 | 2.0 |
| `primary_healthcare_count` | 0.0 | 19.0 | 4.8214 | 3.0 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from HeiGIT (Heidelberg Institute for Geoinformation Technology) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/botswana---risk-assessment-indicators) for the publisher's own methodology notes and caveats.
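
The cleaning steps described under Curation can be sketched in plain Python. This is a minimal illustration, not ESA's actual pipeline: the helper names (`to_snake_case`, `clean_value`, `clean_rows`, `split_rows`), the case-insensitive matching of missing-value markers, and the rounding of the 20% test share are all assumptions, and the Parquet-writing step is left out. Because of those assumptions, it is not expected to reproduce the published train/test partitions.

```python
import random
import re

# Missing-value markers listed in the Curation section.
# Matching them case-insensitively is an assumption of this sketch.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name: str) -> str:
    """Lowercase a column name and collapse non-alphanumeric runs to '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_value(value):
    """Map the listed missing-value markers to NaN; leave other values alone."""
    if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
        return float("nan")
    return value


def clean_rows(rows):
    """Rename columns and unify missing values for a list of dict rows."""
    return [{to_snake_case(k): clean_value(v) for k, v in row.items()} for row in rows]


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle with a fixed seed; the first `test_fraction` of rows become test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)  # rounding down is an assumption
    return shuffled[n_test:], shuffled[:n_test]


# Toy row for illustration only; the value is not taken from the dataset.
raw = [{"ADM2_PCODE": "BW0101", "Hospitals Count": "N/A"}]
print(clean_rows(raw))
```

Fixing the seed makes the split repeatable across runs, which is the property the Curation section relies on when it publishes fixed train and test partitions.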

---

## Citation

```bibtex
@dataset{hdx_africa_demographics_botswana,
  title = {Botswana - Risk Assessment Indicators},
  author = {HeiGIT (Heidelberg Institute for Geoinformation Technology)},
  year = {2026},
  url = {https://data.humdata.org/dataset/botswana---risk-assessment-indicators},
  note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
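
The composition rule listed under QGIS Plugin Risk Assessment Inputs, Coping Capacity = Access + Facilities, can be read as a join of the two tables on their shared P-code column. The sketch below assumes that reading; the function name `join_on_pcode` is hypothetical, and the numbers are toy values rather than figures from the dataset.

```python
def join_on_pcode(access_rows, facility_rows, key="adm2_pcode"):
    """Inner-join two lists of dict rows on the shared P-code column."""
    facilities_by_code = {row[key]: row for row in facility_rows}
    joined = []
    for row in access_rows:
        match = facilities_by_code.get(row[key])
        if match is not None:
            # Merge the access columns with the facility columns for this unit.
            joined.append({**row, **match})
    return joined


# Toy values for illustration only; they are not taken from the dataset.
access = [
    {"adm2_pcode": "BW0101", "access_pop_hospitals_30min": 1200},
    {"adm2_pcode": "BW0201", "access_pop_hospitals_30min": 800},
]
facilities = [
    {"adm2_pcode": "BW0101", "hospitals_count": 2},
    {"adm2_pcode": "BW0201", "hospitals_count": 1},
]

coping = join_on_pcode(access, facilities)
print(coping[0])
```

An inner join drops any unit that is missing from either table, so a row count lower than expected after joining is a quick signal of mismatched P-codes.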
Provider:
electricsheepafrica