electricsheepafrica/african-mobile-subscriber-data
Source: Hugging Face. Updated 2026-03-21; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/african-mobile-subscriber-data
Description:
---
license: cc-by-4.0
tags:
- telecom
- mobile
- subscribers
- sub-saharan-africa
- synthetic
- gsma
language:
- en
pretty_name: African Mobile Subscriber Data
size_categories:
- 10K<n<100K
task_categories:
- tabular-regression
- time-series-forecasting
---
# African Mobile Subscriber Data
Synthetic dataset of mobile telecommunications subscriber metrics across 15 Sub-Saharan African countries, 3 scenarios, and multiple operators.
## Dataset Summary
- **30,000 records** (10,000 per scenario)
- **15 countries**: Nigeria, South Africa, Kenya, Ghana, Tanzania, Ethiopia, Uganda, Cote d'Ivoire, Senegal, Angola, Mozambique, Rwanda, Cameroon, Madagascar, Zambia
- **3 scenarios**: `baseline`, `5g_rollout`, `market_saturation`
- **~50 operators** with country-specific profiles (MTN, Vodacom, Safaricom, Glo, Airtel, Orange, etc.)
## Variables
| Variable | Description | Unit or values |
|---|---|---|
| `record_id` | Unique record identifier | integer |
| `country` | Country name | string |
| `year` | Reporting year (2024–2026) | integer |
| `quarter` | Reporting quarter | Q1–Q4 |
| `operator` | Mobile network operator name | string |
| `technology` | Network generation | 2g/3g/4g/5g |
| `active_subscribers_millions` | Active subscriber count | millions |
| `arpu_usd` | Average revenue per user | USD/month |
| `monthly_churn_pct` | Monthly subscriber churn rate | % |
| `prepaid_share_pct` | Share of prepaid subscribers | % |
| `data_revenue_share_pct` | Data revenue as share of total | % |
| `voice_revenue_share_pct` | Voice revenue as share of total | % |
| `mobile_money_subscribers_millions` | Mobile money subscribers | millions |
| `data_usage_gb_per_user` | Monthly data consumption | GB/user |
| `network_coverage_pct` | Population coverage | % |
| `market_share_pct` | Operator market share | % |
| `spectrum_efficiency_index` | Spectral efficiency score | index |
| `customer_satisfaction_score` | Customer satisfaction rating | 1–5 |
| `scenario` | Simulation scenario | string |
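The table above can double as a lightweight schema check after loading. The following is a minimal sketch, not the logic of `validate_dataset.py` (whose contents are not shown here). It assumes that every `*_pct` column is on a 0 to 100 scale and that `quarter` is stored as the strings `Q1` to `Q4`; `check_schema` and `EXPECTED_COLUMNS` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Column names copied from the Variables table.
EXPECTED_COLUMNS = [
    "record_id", "country", "year", "quarter", "operator", "technology",
    "active_subscribers_millions", "arpu_usd", "monthly_churn_pct",
    "prepaid_share_pct", "data_revenue_share_pct", "voice_revenue_share_pct",
    "mobile_money_subscribers_millions", "data_usage_gb_per_user",
    "network_coverage_pct", "market_share_pct", "spectrum_efficiency_index",
    "customer_satisfaction_score", "scenario",
]


def check_schema(df: pd.DataFrame) -> list:
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        # Stop early: the range checks below need every column to be present.
        problems.append(f"missing columns: {missing}")
        return problems
    # Assumption: percentage columns are expressed on a 0-100 scale.
    for col in [c for c in EXPECTED_COLUMNS if c.endswith("_pct")]:
        if not df[col].between(0, 100).all():
            problems.append(f"{col} has values outside 0-100")
    if not df["customer_satisfaction_score"].between(1, 5).all():
        problems.append("customer_satisfaction_score outside 1-5")
    # Assumption: quarters are stored as the strings "Q1".."Q4".
    if not df["quarter"].isin(["Q1", "Q2", "Q3", "Q4"]).all():
        problems.append("quarter outside Q1-Q4")
    if not df["year"].between(2024, 2026).all():
        problems.append("year outside 2024-2026")
    return problems
```

Calling `check_schema(df)` on the loaded frame and printing the result gives a quick sanity report before any modelling.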
## Scenarios
- **baseline**: Current trajectory extrapolation
- **5g_rollout**: Accelerated 5G deployment with reduced subscriber counts but higher data usage and spectrum efficiency
- **market_saturation**: Mature market with higher churn, increased subscriber penetration, and shifted revenue mix toward data
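One way to see how the scenarios differ is to compare per-scenario means against `baseline`. The sketch below assumes the column names from the Variables table; the choice of metrics and the helper names `scenario_means` and `relative_to_baseline` are illustrative, not part of the dataset tooling.

```python
import pandas as pd

# Metrics that the scenario descriptions mention explicitly.
METRICS = [
    "active_subscribers_millions",
    "data_usage_gb_per_user",
    "spectrum_efficiency_index",
    "monthly_churn_pct",
    "data_revenue_share_pct",
]


def scenario_means(df: pd.DataFrame) -> pd.DataFrame:
    """Mean of each selected metric, one row per scenario."""
    return df.groupby("scenario")[METRICS].mean()


def relative_to_baseline(means: pd.DataFrame) -> pd.DataFrame:
    """Ratio of each scenario's mean to the baseline mean (baseline row is 1.0)."""
    return means.div(means.loc["baseline"], axis="columns")
```

With the full dataset loaded as `df`, `relative_to_baseline(scenario_means(df))` returns one row per scenario. Ratios below 1.0 for `active_subscribers_millions` under `5g_rollout`, for example, would be consistent with the description above.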
## Key Context
- **Nigeria** (~220M total subscribers): MTN Nigeria, Glo, Airtel Africa, 9mobile
- **South Africa** (~100M total SIMs): MTN SA, Vodacom, Cell C, Telkom Mobile
- **Kenya**: Safaricom dominance (~63% market share) with M-Pesa mobile money leadership (~82% penetration)
- Mobile money penetration varies significantly by country (e.g., Kenya 82% vs South Africa 8%)
## Usage
```python
import pandas as pd

# Load the dataset CSV
df = pd.read_csv("data/african_mobile_subscribers.csv")
print(df.describe())

# Filter by scenario
baseline = df[df["scenario"] == "baseline"]

# Combine conditions with & and wrap each one in parentheses
nigeria_5g = df[(df["country"] == "Nigeria") & (df["scenario"] == "5g_rollout")]
```
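For the time-series forecasting task, the `year` and `quarter` columns can be combined into a single quarterly period. The sketch below assumes `quarter` is stored as a string such as `Q1`. It also assumes that several records may share the same quarter (for example one per technology), so it averages them, which may not match the actual data layout. The helper names are hypothetical.

```python
import pandas as pd


def add_period(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a quarterly `period` column built from year and quarter."""
    out = df.copy()
    # Assumption: quarter looks like "Q1"; year is an integer such as 2024.
    labels = out["year"].astype(str) + out["quarter"].astype(str)  # e.g. "2024Q1"
    out["period"] = pd.PeriodIndex(labels.to_numpy(), freq="Q")
    return out


def quarterly_series(df, country, operator, scenario,
                     metric="active_subscribers_millions"):
    """Mean of `metric` per quarter for one country, operator, and scenario."""
    mask = (
        (df["country"] == country)
        & (df["operator"] == operator)
        & (df["scenario"] == scenario)
    )
    # Group by period so repeated records within a quarter collapse to one value.
    return add_period(df[mask]).groupby("period")[metric].mean().sort_index()
```

For instance, `quarterly_series(df, "Kenya", "Safaricom", "baseline")` would return a series indexed by quarter, provided those country and operator labels appear in the data exactly as written.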
## Generation
```bash
python generate_dataset.py
python validate_dataset.py
```
## License
CC BY 4.0. This is a synthetic dataset for research and educational purposes. It does not represent actual operator data.
Provider: electricsheepafrica



