electricsheepafrica/africa-the-kenya-2014-adult-hiv-prevalence-rate-by-county
Hugging Face · Updated 2026-04-07 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-the-kenya-2014-adult-hiv-prevalence-rate-by-county
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: other
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- tabular-regression
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- disease
- governance-and-civil-society
- health
- ken
pretty_name: "Kenya - Adult HIV prevalence rate by County"
dataset_info:
  splits:
  - name: train
    num_examples: 37
  - name: test
    num_examples: 9
---
# Kenya - Adult HIV prevalence rate by County
**Publisher:** Kenya Open Data Initiative (inactive) · **Source:** [HDX](https://data.humdata.org/dataset/the-kenya-2014-adult-hiv-prevalence-rate-by-county) · **License:** `other-pd-nr` · **Updated:** 2023-03-03
---
## Abstract
Based on Kenya's 2014 county-level HIV/AIDS profile data.
This dataset describes the HIV/AIDS situation in Kenya's 47 counties: adults and children living with HIV, new infections, gender-specific HIV prevalence, and households with orphans and their financial situation.
Each row is a geolocated point observation for one county. Data was last updated on HDX on 2023-03-03. Geographic scope: **KEN**.
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
---
## Dataset Characteristics
| | |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Geolocated point observations |
| **Rows (total)** | 47 |
| **Columns** | 29 (15 numeric, 12 categorical, 2 datetime) |
| **Train split** | 37 rows |
| **Test split** | 9 rows |
| **Geographic scope** | KEN |
| **Publisher** | Kenya Open Data Initiative (inactive) |
| **HDX last updated** | 2023-03-03 |
---
## Variables
**Geographic** — `county_name` (SIAYA, BUNGOMA, WEST POKOT), `total_population` (range 115520.0–3781394.0), `poe_medical_ward` (2%, 5%, 4%), `kes_cash_transfer_beneficiary_poor_households_with_an_orphan` (range 557.0–8107.0), `aids_related_deaths_15` (range 50.0–3579.0) and 2 others.
**Temporal** — `updated_at`.
**Demographic** — `art_coverage` (82%, 38%, 97%), `no_of_households_with_an_orphan` (range 2380.0–69730.0), `poor_households_with_an_orphan` (range 1166.0–34168.0).
**Identifier / Metadata** — `cartodb_id` (range 1.0–47.0), `esa_source`, `esa_processed`.
**Other** — `adult_15_hiv_prevalence` (range 0.2–25.7), `new_hiv_infections_adults_15` (range 18.0–12279.0), `new_hiv_infections_children_0_14` (range 2.0–2700.0), `hiv_adults` (range 307.0–140629.0), `hiv_prevalence_men` (3.4%, 2.3%, 3.3%) and 10 others.
---
## Quick Start
```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-the-kenya-2014-adult-hiv-prevalence-rate-by-county")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `county_name` | object | 0.0% | SIAYA, BUNGOMA, WEST POKOT |
| `adult_15_hiv_prevalence` | float64 | 0.0% | 0.2 – 25.7 (mean 5.7128) |
| `new_hiv_infections_adults_15` | int64 | 0.0% | 18.0 – 12279.0 (mean 1885.7872) |
| `new_hiv_infections_children_0_14` | int64 | 0.0% | 2.0 – 2700.0 (mean 272.9149) |
| `art_coverage` | object | 0.0% | 82%, 38%, 97% |
| `hiv_adults` | int64 | 0.0% | 307.0 – 140629.0 (mean 28633.6596) |
| `total_population` | float64 | 2.1% | 115520.0 – 3781394.0 (mean 904730.6957) |
| `hiv_prevalence_men` | object | 2.1% | 3.4%, 2.3%, 3.3% |
| `hiv_prevalence_women` | object | 0.0% | 4.5%, 5.3%, 6.8% |
| `poe_prevention_of_mother_to_child_transmission` | object | 0.0% | 38%, 39%, 43% |
| `poe_voluntering_and_testing` | object | 0.0% | 58%, 59%, 110% |
| `poe_tuberculosis` | object | 0.0% | 1%, 3%, 2% |
| `poe_medical_ward` | object | 0.0% | 2%, 5%, 4% |
| `poe_overral` | object | 0.0% | 0%, 36%, 28% |
| `adults_in_need_of_art` | int64 | 0.0% | 250.0 – 102103.0 (mean 16189.0) |
| `adults_receiving_art` | int64 | 0.0% | 66.0 – 93714.0 (mean 12621.8298) |
| `children_in_need_of_art` | int64 | 0.0% | 114.0 – 15235.0 (mean 3020.4894) |
| `children_receiving_art` | int64 | 0.0% | 5.0 – 6988.0 (mean 1279.5957) |
| `no_of_households_with_an_orphan` | int64 | 0.0% | 2380.0 – 69730.0 (mean 23036.3617) |
| `poor_households_with_an_orphan` | int64 | 0.0% | 1166.0 – 34168.0 (mean 11288.383) |
| `kes_cash_transfer_beneficiary_poor_households_with_an_orphan` | int64 | 0.0% | 557.0 – 8107.0 (mean 3236.1915) |
| `aids_related_deaths_15` | float64 | 2.1% | 50.0 – 3579.0 (mean 1041.913) |
| `aids_related_deaths_0_14` | int64 | 0.0% | 9.0 – 1234.0 (mean 221.2979) |
| `coordinates` | object | 0.0% | (-0.062129888622, 34.247641704500), (0.749285239620, 34.640460875600), (1.740106171290, 35.243846809200) |
| `cartodb_id` | int64 | 0.0% | 1.0 – 47.0 (mean 24.0) |
| `created_at` | datetime64[ns, UTC] | 0.0% | |
| `updated_at` | datetime64[ns, UTC] | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |
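
Several columns in the schema are typed `object` because their values are strings such as `82%` or a parenthesised coordinate pair. The following is a minimal sketch of one way to convert them after loading; it is not part of the published curation. The helper names `percent_to_float` and `split_coordinates` are hypothetical, and the (latitude, longitude) ordering is an assumption inferred from the sample values. The toy frame reuses the sample values shown in the table.

```python
import pandas as pd

def percent_to_float(series: pd.Series) -> pd.Series:
    """Turn strings like '82%' into floats like 82.0; unparseable values become NaN."""
    stripped = series.astype(str).str.strip().str.rstrip("%")
    return pd.to_numeric(stripped, errors="coerce").astype(float)

def split_coordinates(series: pd.Series) -> pd.DataFrame:
    """Split '(a, b)' strings into two float columns.

    Assumption: the pair is ordered (latitude, longitude).
    Rows that do not match the pattern yield NaN in both columns.
    """
    number = r"-?\d+(?:\.\d+)?"
    pattern = rf"\(\s*(?P<latitude>{number})\s*,\s*(?P<longitude>{number})\s*\)"
    return series.astype(str).str.extract(pattern).astype(float)

# Toy frame built from the sample values in the schema table.
demo = pd.DataFrame({
    "art_coverage": ["82%", "38%", "97%"],
    "coordinates": [
        "(-0.062129888622, 34.247641704500)",
        "(0.749285239620, 34.640460875600)",
        "(1.740106171290, 35.243846809200)",
    ],
})
demo["art_coverage_pct"] = percent_to_float(demo["art_coverage"])
demo = demo.join(split_coordinates(demo["coordinates"]))
print(demo[["art_coverage_pct", "latitude", "longitude"]])
```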
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `adult_15_hiv_prevalence` | 0.2 | 25.7 | 5.7128 | 4.3 |
| `new_hiv_infections_adults_15` | 18.0 | 12279.0 | 1885.7872 | 988.0 |
| `new_hiv_infections_children_0_14` | 2.0 | 2700.0 | 272.9149 | 60.0 |
| `hiv_adults` | 307.0 | 140629.0 | 28633.6596 | 18923.0 |
| `total_population` | 115520.0 | 3781394.0 | 904730.6957 | 856303.0 |
| `adults_in_need_of_art` | 250.0 | 102103.0 | 16189.0 | 10586.0 |
| `adults_receiving_art` | 66.0 | 93714.0 | 12621.8298 | 6507.0 |
| `children_in_need_of_art` | 114.0 | 15235.0 | 3020.4894 | 2058.0 |
| `children_receiving_art` | 5.0 | 6988.0 | 1279.5957 | 725.0 |
| `no_of_households_with_an_orphan` | 2380.0 | 69730.0 | 23036.3617 | 18492.0 |
| `poor_households_with_an_orphan` | 1166.0 | 34168.0 | 11288.383 | 9061.0 |
| `kes_cash_transfer_beneficiary_poor_households_with_an_orphan` | 557.0 | 8107.0 | 3236.1915 | 2474.0 |
| `aids_related_deaths_15` | 50.0 | 3579.0 | 1041.913 | 747.5 |
| `aids_related_deaths_0_14` | 9.0 | 1234.0 | 221.2979 | 132.0 |
| `cartodb_id` | 1.0 | 47.0 | 24.0 | 24.0 |
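
A table of this shape can be recomputed with pandas. The sketch below assumes the statistics were taken over every row (train and test combined), which the card does not state. `numeric_summary` is a hypothetical helper, and the toy frame holds made-up values that only illustrate the output layout.

```python
import pandas as pd

def numeric_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Min, max, mean and median for every numeric column, one row per column."""
    numeric = df.select_dtypes(include="number")
    summary = numeric.agg(["min", "max", "mean", "median"]).T
    summary.columns = ["Min", "Max", "Mean", "Median"]
    return summary.round(4)

# Toy frame with made-up values (not taken from the dataset).
toy = pd.DataFrame({
    "county_name": ["A", "B", "C", "D"],
    "adult_15_hiv_prevalence": [1.0, 2.0, 3.0, 10.0],
    "hiv_adults": [100, 200, 300, 400],
})
summary = numeric_summary(toy)
print(summary)
```

On the real data, the input would be something like `pd.concat([train, test], ignore_index=True)`, using the two frames from the Quick Start.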
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. One column with more than 80% missing values, `the_geom`, was removed. Two columns were cast from string to numeric or datetime because more than 85% of their values parsed successfully. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
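
The curation steps described above can be sketched in pandas as follows. This is an illustrative reconstruction under stated assumptions, not the actual Electric Sheep Africa pipeline. The function names, the case-insensitive marker matching, and the use of `DataFrame.sample` for the split are all assumptions, and only the numeric cast is shown (the datetime cast is omitted).

```python
import re
import numpy as np
import pandas as pd

# Missing-value markers listed in the card, lowercased (case-insensitivity is an assumption).
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}

def to_snake_case(name: str) -> str:
    """Lowercase a column name and collapse non-alphanumeric runs to '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")

def is_marker(value) -> bool:
    return isinstance(value, str) and value.strip().lower() in MISSING_MARKERS

def clean(df: pd.DataFrame, drop_threshold: float = 0.80,
          cast_threshold: float = 0.85) -> pd.DataFrame:
    out = df.rename(columns=to_snake_case)
    # Unify missing-value markers to NaN, cell by cell.
    out = out.apply(lambda col: col.map(lambda v: np.nan if is_marker(v) else v))
    # Drop columns whose missing fraction exceeds drop_threshold.
    out = out.loc[:, out.isna().mean() <= drop_threshold].copy()
    # Cast non-numeric columns to numeric when enough non-missing values parse.
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]) or pd.api.types.is_datetime64_any_dtype(out[col]):
            continue
        non_null = out[col].dropna()
        parsed = pd.to_numeric(non_null, errors="coerce")
        if len(non_null) and parsed.notna().mean() > cast_threshold:
            out[col] = pd.to_numeric(out[col], errors="coerce")
    return out

def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Random 80/20 split with a fixed seed (the card's actual splitter is not stated)."""
    train = df.sample(frac=0.8, random_state=seed)
    test = df.drop(train.index)
    return train.reset_index(drop=True), test.reset_index(drop=True)

# Toy input with made-up values, only to exercise each step.
raw = pd.DataFrame({
    "County Name": ["A", "B", "C", "D", "E"],
    "New Infections": ["10", "20", "N/A", "40", "50"],
    "the_geom": ["-", "unknown", None, "null", "no data"],
})
cleaned = clean(raw)
train_df, test_df = split_80_20(cleaned)
print(cleaned.dtypes)
print(len(train_df), len(test_df))
```

Because the card does not say which splitting function was used, the split sizes this sketch produces on the real data need not match the reported 37 and 9 rows.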
---
## Limitations
- Data originates from Kenya Open Data Initiative (inactive) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/the-kenya-2014-adult-hiv-prevalence-rate-by-county) for the publisher's own methodology notes and caveats.
---
## Citation
```bibtex
@dataset{hdx_africa_the_kenya_2014_adult_hiv_prevalence_rate_by_county,
title = {Kenya - Adult HIV prevalence rate by County},
author = {Kenya Open Data Initiative (inactive)},
year = {2023},
url = {https://data.humdata.org/dataset/the-kenya-2014-adult-hiv-prevalence-rate-by-county},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provider:
electricsheepafrica



