
electricsheepafrica/african-internet-penetration-district

Source: Hugging Face. Updated 2026-03-21; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/african-internet-penetration-district
Resource description:
---
license: cc-by-4.0
tags:
- digital
- internet
- connectivity
- sub-saharan-africa
- synthetic
- telecom
- infrastructure
language:
- en
pretty_name: African Internet Penetration District Dataset
size_categories:
- 10K<n<100K
task_categories:
- tabular-regression
- tabular-classification
---

# African Internet Penetration District Dataset

Synthetic district-level internet connectivity data for 12 Sub-Saharan African countries across three policy scenarios (2018–2025).

## Dataset Description

This dataset provides granular, district-level estimates of internet penetration and digital infrastructure metrics across Sub-Saharan Africa. It is fully synthetic, generated with realistic statistical distributions calibrated against public data from ITU, GSMA, World Bank, and national regulatory authorities.

### Countries

| Country | Baseline Avg Internet Pen |
|---------|--------------------------|
| South Africa | ~70% |
| Nigeria | ~55% |
| Kenya | ~65% |
| DR Congo | ~25% |
| Ghana | ~53% |
| Tanzania | ~40% |
| Ethiopia | ~20% |
| Uganda | ~35% |
| Rwanda | ~45% |
| Senegal | ~42% |
| Cote d'Ivoire | ~38% |
| Mozambique | ~15% |

### Scenarios

- **baseline**: Business-as-usual growth trajectory
- **infrastructure_push**: Aggressive infrastructure investment (+12pp internet, +40% speed, -30% cost)
- **digital_divide**: Widening digital gap (-8pp internet, -30% speed, +30% cost)

### Variables

| Variable | Type | Description |
|----------|------|-------------|
| `record_id` | int | Unique record identifier |
| `country` | str | Country name |
| `district` | str | Province/state/district name |
| `year` | int | Year (2018–2025) |
| `population` | int | District population estimate |
| `internet_penetration_pct` | float | Internet penetration rate (%) |
| `mobile_broadband_pct` | float | Mobile broadband subscription rate (%) |
| `fixed_broadband_pct` | float | Fixed broadband subscription rate (%) |
| `urban_rural` | str | Urban or rural classification |
| `avg_download_speed_mbps` | float | Average download speed (Mbps) |
| `data_cost_per_gb_usd` | float | Mobile data cost per GB (USD) |
| `smartphone_penetration_pct` | float | Smartphone ownership rate (%) |
| `2g_coverage_pct` | float | 2G network coverage (%) |
| `3g_coverage_pct` | float | 3G network coverage (%) |
| `4g_coverage_pct` | float | 4G/LTE network coverage (%) |
| `5g_coverage_pct` | float | 5G network coverage (%) |
| `digital_literacy_score` | float | Digital literacy composite score (0–100) |
| `affordability_index` | float | Affordability index (0–100) |
| `connectivity_gap_index` | float | Connectivity gap index (0–100, higher = worse) |
| `scenario` | str | Policy scenario label |

## Files

- `data/african_internet_penetration_district.csv`: Combined dataset (all scenarios)
- `data/african_internet_penetration_baseline.csv`: Baseline scenario
- `data/african_internet_penetration_infrastructure_push.csv`: Infrastructure push scenario
- `data/african_internet_penetration_digital_divide.csv`: Digital divide scenario
- `data/african_internet_penetration_district.parquet`: Parquet format (all scenarios)
- `generate_dataset.py`: Generation script
- `validate_dataset.py`: Validation script

## Usage

```python
import pandas as pd

df = pd.read_csv("data/african_internet_penetration_district.csv")
print(df.describe())

# Filter by country and scenario
sa_baseline = df[(df["country"] == "South Africa") & (df["scenario"] == "baseline")]
```

## Generation

```bash
pip install -r requirements.txt
python generate_dataset.py
python validate_dataset.py
```

## License

Creative Commons Attribution 4.0 International (CC-BY-4.0).
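
To make the scenario definitions concrete, the sketch below applies the headline adjustments listed under Scenarios (+12pp/-8pp internet penetration, ±speed, ±cost) to a tiny baseline table. This is not the logic of `generate_dataset.py`, which is not shown here. The helper `apply_scenario`, the clipping to 0–100%, and the speed and cost values in the toy rows are all assumptions for illustration; only the penetration figures come from the Countries table.

```python
import pandas as pd

# Headline adjustments from the Scenarios section:
# (percentage-point shift in internet penetration, speed multiplier, cost multiplier)
SCENARIO_ADJUSTMENTS = {
    "baseline": (0.0, 1.0, 1.0),
    "infrastructure_push": (12.0, 1.4, 0.7),
    "digital_divide": (-8.0, 0.7, 1.3),
}


def apply_scenario(baseline: pd.DataFrame, scenario: str) -> pd.DataFrame:
    """Hypothetical helper: return a copy of `baseline` with a scenario applied."""
    pp_shift, speed_mult, cost_mult = SCENARIO_ADJUSTMENTS[scenario]
    out = baseline.copy()
    # Assumption: penetration is kept inside the valid 0-100% range.
    out["internet_penetration_pct"] = (
        out["internet_penetration_pct"] + pp_shift
    ).clip(0, 100)
    out["avg_download_speed_mbps"] = out["avg_download_speed_mbps"] * speed_mult
    out["data_cost_per_gb_usd"] = out["data_cost_per_gb_usd"] * cost_mult
    out["scenario"] = scenario
    return out


# Toy rows: penetration taken from the Countries table above;
# speed and cost are placeholder values, not figures from the dataset.
toy = pd.DataFrame({
    "country": ["South Africa", "Mozambique"],
    "internet_penetration_pct": [70.0, 15.0],
    "avg_download_speed_mbps": [20.0, 5.0],
    "data_cost_per_gb_usd": [2.0, 4.0],
    "scenario": ["baseline", "baseline"],
})

push = apply_scenario(toy, "infrastructure_push")
divide = apply_scenario(toy, "digital_divide")
print(pd.concat([toy, push, divide], ignore_index=True))
```

With the real files, comparing `internet_penetration_pct` between the `baseline` and `infrastructure_push` rows for the same `country`, `district`, and `year` is one way to check whether the published data follows a similar offset.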
Provided by:
electricsheepafrica