
electricsheepafrica/africa-education-namibia

Hugging Face · Updated 2026-04-27 · Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-education-namibia
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-sa-4.0
multilinguality:
- monolingual
size_categories:
- 1K<n<10K
source_datasets:
- original
task_categories:
- tabular-classification
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- education
- health-facilities
- transportation
- nam
pretty_name: "Namibia - Accessibility Indicators"
dataset_info:
  splits:
  - name: train
    num_examples: 1872
  - name: test
    num_examples: 468
---

# Namibia - Accessibility Indicators

**Publisher:** HeiGIT (Heidelberg Institute for Geoinformation Technology) · **Source:** [HDX](https://data.humdata.org/dataset/namibia-accessibility-indicators) · **License:** `cc-by-sa` · **Updated:** 2026-02-27

---

## Abstract

This dataset provides insights into spatial accessibility to healthcare and education services across Namibia. It has been created using free and open tools such as [openrouteservice](https://openrouteservice.org/) and open data sources, primarily [OpenStreetMap](https://www.openstreetmap.org/) (OSM).

To assess accessibility to education and healthcare, we use travel-time isochrones, i.e. polygons representing areas reachable within a given time or distance by car. We overlay these isochrones with [WorldPop](https://www.worldpop.org/) population data, which provides 100m-resolution estimates. This allows us to calculate the population within time intervals from 10 to 120 minutes away from hospital services and distance intervals from 5 to 50 km away from schools.

The unit of analysis is defined by [geoboundaries](https://www.geoboundaries.org/) country borders, and where available we also summarise results at finer administrative levels (ADM 1–4).

Data Structure:

- **name**: Region or country name.
- **iso**: ISO3 country code.
- **id**: Unique identifier for the administrative unit.
- **country**: ISO3 country code.
- **admin_level**: Administrative level of the unit.
- **category**: Service category: `education`, `hospitals` or `primary_healthcare`.
- **range_type**: Method used for the catchment zone: `distance` or `time`.
- **range**: Distance (in meters) or travel time (in seconds) from schools used to generate the polygon.
- **population**: Total population within the specified range.
- **school_age_population**: Number of school-age individuals within the range.
- **school_age_population_share**: Cumulative percentage of school-age population.
- **school_age_population_interval**: Incremental school-age population added in the current distance band.
- **school_age_population_interval_share**: Proportion of new school-age population in the current interval.
- **population_share**: Cumulative percentage of total population.
- **population_interval**: Incremental population added in the current distance band.
- **population_interval_share**: Share of the total population represented by the current interval.

This dataset is one of many [HeiGIT exports on HDX](https://data.humdata.org/organization/heidelberg-institute-for-geoinformation-technology). See the [HeiGIT](https://heigit.org/) website for more information.

We are looking forward to hearing about your use-case! Feel free to reach out to us and tell us about your research at [communications@heigit.org](mailto:communications@heigit.org). We would be happy to amplify your work.

References:

- [Geldsetzer, P., Reinmuth, M., Ouma, P. O., Lautenbach, S. et al. (2020)](https://www.thelancet.com/journals/lanhl/article/PIIS2666-7568(20)30010-6/fulltext)
- [Petricola, S., Reinmuth, M., Lautenbach, S. et al. (2022)](https://ij-healthgeographics.biomedcentral.com/articles/10.1186/s12942-022-00315-2)
- [Klipper, I. G., Zipf, A., and Lautenbach, S. (2021)](https://agile-giss.copernicus.org/articles/2/4/2021/)
- [Ruiz Sánchez, R., Reinmuth, M., Albornoz, C., Lautenbach, S., and Zipf, A. (2025)](https://agile-giss.copernicus.org/articles/6/10/2025/)

Further Information:

- [Open Access Lens](https://giscience.github.io/open-access-lens/#/)

**Limitations**:

* **OSM Completeness**: This analysis relies on OpenStreetMap (OSM) data. While OSM is the most complete open map of the world, data quality varies significantly by region. In areas with unmapped roads or facilities, accessibility may be underestimated.
* **Population Estimates**: Population counts are derived from WorldPop top-down estimates (constrained). These are statistical models based on census projections and satellite imagery, not direct census counts, and may contain inaccuracies at the local pixel level.
* **Travel Time Assumptions**: Isochrones are calculated using standard vehicle speeds for different road types. These models do not account for real-time traffic, seasonal weather conditions (e.g., flooding), or road surface degradation.
* **Boundary Precision**: Administrative boundaries are sourced from geoBoundaries. These may differ slightly from official government demarcations or other schemas.

Each row in this dataset represents country-level aggregates. Data was last updated on HDX on 2026-02-27. Geographic scope: **NAM**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Country-level aggregates |
| **Rows (total)** | 2,340 |
| **Columns** | 14 (5 numeric, 9 categorical, 0 datetime) |
| **Train split** | 1,872 rows |
| **Test split** | 468 rows |
| **Geographic scope** | NAM |
| **Publisher** | HeiGIT (Heidelberg Institute for Geoinformation Technology) |
| **HDX last updated** | 2026-02-27 |

---

## Variables

**Geographic**: `country` (NAM), `admin_level` (ADM2, ADM1, ADM0), `category` (education), `range_type` (DISTANCE), `population_type` (school_age, total) and 4 others.

**Identifier / Metadata**: `name` (Luderitz, Karas, Kalahari), `id` (8085530B86358564716630, 8085530B72153402463843, 8085530B71951838310116), `esa_source` (HDX), `esa_processed` (2026-04-27).

**Other**: `range` (range 5000.0–50000.0).

---

## Quick Start

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub (train and test splits).
ds = load_dataset("electricsheepafrica/africa-education-namibia")

# Convert each split to a pandas DataFrame for inspection.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `name` | object | 0.0% | Luderitz, Karas, Kalahari |
| `id` | object | 0.0% | 8085530B86358564716630, 8085530B72153402463843, 8085530B71951838310116 |
| `country` | object | 0.0% | NAM |
| `admin_level` | object | 0.0% | ADM2, ADM1, ADM0 |
| `category` | object | 0.0% | education |
| `range_type` | object | 0.0% | DISTANCE |
| `range` | int64 | 0.0% | 5000.0 – 50000.0 (mean 27500.0) |
| `population_type` | object | 0.0% | school_age, total |
| `population` | int64 | 0.0% | 22.0 – 1900195.0 (mean 29338.6064) |
| `population_share` | float64 | 0.0% | 0.89 – 100.0 (mean 55.5838) |
| `population_interval` | int64 | 0.0% | 0.0 – 1368977.0 (mean 3451.8329) |
| `population_interval_share` | float64 | 0.0% | 0.0 – 100.0 (mean 6.7706) |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-27 |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `range` | 5000.0 | 50000.0 | 27500.0 | 27500.0 |
| `population` | 22.0 | 1900195.0 | 29338.6064 | 7310.0 |
| `population_share` | 0.89 | 100.0 | 55.5838 | 51.63 |
| `population_interval` | 0.0 | 1368977.0 | 3451.8329 | 124.0 |
| `population_interval_share` | 0.0 | 100.0 | 6.7706 | 0.87 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. One column with >80% missing values was removed: `iso`. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from HeiGIT (Heidelberg Institute for Geoinformation Technology) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/namibia-accessibility-indicators) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_education_namibia,
  title  = {Namibia - Accessibility Indicators},
  author = {HeiGIT (Heidelberg Institute for Geoinformation Technology)},
  year   = {2026},
  url    = {https://data.humdata.org/dataset/namibia-accessibility-indicators},
  note   = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
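
The curation steps described in the Curation section (snake_case column names, unified missing-value markers, dropping columns with more than 80% missing values, and a seeded 80/20 split) could look roughly like the pandas sketch below. This is not ESA's actual pipeline: the function name `curate`, the case-insensitive marker matching, the use of `DataFrame.sample` for the split, and the toy `raw` frame are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Markers listed in the Curation section. Matching them case-insensitively
# is an assumption of this sketch.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def curate(df: pd.DataFrame, max_missing: float = 0.8, seed: int = 42):
    """Hypothetical re-creation of the described cleaning and split steps."""
    out = df.copy()

    # Lowercase column names and turn spaces/hyphens into underscores.
    out.columns = (
        out.columns.str.strip().str.lower().str.replace(r"[\s\-]+", "_", regex=True)
    )

    # Replace string cells that are missing-value markers with NaN.
    def to_nan(value):
        if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
            return np.nan
        return value

    out = out.apply(lambda col: col.map(to_nan))

    # Drop columns whose share of missing values exceeds the threshold.
    missing_share = out.isna().mean()
    out = out.loc[:, missing_share <= max_missing]

    # 80/20 split with a fixed random seed; test is everything not in train.
    train = out.sample(frac=0.8, random_state=seed)
    test = out.drop(train.index)
    return train, test


# Toy input (made-up values): "ISO" is 90% missing and should be dropped.
raw = pd.DataFrame(
    {
        "Name": ["Karas", "Kalahari", "Luderitz", "Karas", "Kalahari"] * 2,
        "ISO": ["N/A"] * 9 + ["NAM"],
        "Population": [22, 7310, 124, 500, 900, 1000, 1200, 1500, 1800, 2000],
    }
)
train, test = curate(raw)
print(list(train.columns), train.shape, test.shape)
```

An 80% sample of the 2,340 rows reported above gives 1,872 training rows and 468 test rows, which matches the split sizes in the card. The exact row assignment depends on the splitting routine actually used, so this sketch is not expected to reproduce the published partitions.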
Provided by:
electricsheepafrica