electricsheepafrica/african-police-crime-statistics
Source: Hugging Face; updated 2026-03-20; indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/african-police-crime-statistics
Description:
---
license: cc-by-4.0
task_categories:
- tabular-classification
- tabular-regression
language:
- en
tags:
- governance
- policing
- crime-statistics
- sub-saharan-africa
- synthetic
- security
- public-safety
- clearance-rates
- lmic
pretty_name: African Police & Crime Statistics
size_categories:
- 10K<n<100K
configs:
- config_name: baseline
  data_files: data/baseline.csv
  default: true
- config_name: reform_improved
  data_files: data/reform_improved.csv
- config_name: deteriorated
  data_files: data/deteriorated.csv
---
# African Police & Crime Statistics
## Abstract
A synthetic dataset modeling police capacity and crime statistics across 12 sub-Saharan African countries (2018–2025), parameterized from South African Police Service (SAPS) crime reports, Afrobarometer surveys, and security sector studies. It contains 10,000 records for each of three policing scenarios (baseline, reform_improved, deteriorated), 30,000 in total, with 19 variables covering police-to-population ratios, crime rates by type, clearance rates, investigation times, public trust, corruption perception, response times, and policing effectiveness classifications. It is designed for ML classification, anomaly detection, and security sector governance research.
## 1. Introduction
Policing across sub-Saharan Africa faces profound challenges. Despite a much higher violent crime rate, South Africa has fewer police per 100,000 people than the US, UK, France, Italy, and China. Nigeria needs approximately 156,000 additional officers to meet commonly cited benchmarks. Kenya's police-to-population ratio improved from 1:978 in 2009 to 1:688 in 2021, but remains well below effective coverage.
Afrobarometer's 39-country survey (2021–2023) reveals that fewer than half (46%) of citizens trust the police "somewhat" or "a lot," and only 37% believe their government is doing well at reducing crime. South Africa recorded a crime index of 75.4 in 2024 (5th most dangerous globally), with contact crimes increasing 4.6% in early 2024. Sub-Saharan Africa's homicide rate averaged 13.9 per 100,000 in 2021.
This dataset fills a gap: no equivalent ML-ready dataset on Hugging Face exists for police capacity and crime statistics in Africa, despite strong demand from security analysts, DFIs, police reform researchers, and governance monitoring organizations.
## 2. Methodology
### 2.1 Target Population
Subnational (region-level) police and crime records for 12 sub-Saharan African countries spanning 2018–2025, across four region types and seven crime types.
**Countries included:** South Africa, Nigeria, Kenya, Ghana, Tanzania, Uganda, Rwanda, Ethiopia, Senegal, DRC, Mozambique, Botswana.
### 2.2 Parameterization Evidence Table
| Parameter | Value Used | Source | Year | Note |
|-----------|-----------|--------|------|------|
| SA crime index | 75.4 | Statista | 2024 | 5th most dangerous globally |
| SA contact crimes increase | +4.6% (Q4 2023/24) | SAPS via PMG | 2024 | 171,707 reported |
| SSA homicide rate | 13.9 per 100k | Macrotrends | 2021 | 5.3% increase from 2020 |
| Nigeria police ratio | 187 per 100k | Nexus Engineering | 2025 | Deficit of ~156,000 officers |
| Kenya police ratio | 1:688 (145 per 100k) | Africa Check | 2021 | Improved from 1:978 (2009) |
| Police trust SSA | 46% trust police | Afrobarometer PP90 | 2024 | 39-country survey |
| Government crime reduction approval | 37% | Afrobarometer PP90 | 2024 | Ranges 10% (Sudan) to 77% (Benin) |
| SA stations meeting UN ratio | 16.1% | DA parliamentary reply | 2019 | Only 1:220 ratio cited |
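
The table quotes police coverage both as a 1:N ratio and as officers per 100,000 people. The conversion is plain arithmetic. A minimal sketch follows; the function name `ratio_to_per_100k` is illustrative and not part of the dataset tooling.

```python
def ratio_to_per_100k(people_per_officer: float) -> float:
    """Convert a 1:N police-to-population ratio into officers per 100,000 people."""
    if people_per_officer <= 0:
        raise ValueError("people_per_officer must be positive")
    return 100_000 / people_per_officer


# Kenya figures quoted in the table above.
print(round(ratio_to_per_100k(688)))  # 2021, 1:688 -> 145
print(round(ratio_to_per_100k(978)))  # 2009, 1:978 -> 102
```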
### 2.3 Scenario Design
| Scenario | Description | Crime Mult | Clearance Mult | Trust Mult |
|----------|-------------|------------|----------------|------------|
| **baseline** | Current SSA policing landscape (2018–2025) | 1.0× | 1.0× | 1.0× |
| **reform_improved** | Police reform with better training and community policing | 0.75× | 1.4× | 1.3× |
| **deteriorated** | Weak institutions, high corruption, under-resourced police | 1.35× | 0.6× | 0.7× |
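
This card does not show how the generator applies these multipliers. One plausible reading, sketched here under that assumption, is to scale a baseline value and clip the result to the range listed in the schema. `SCENARIOS`, `RANGES`, and `apply_scenario` are hypothetical names, and the actual `generate_dataset.py` may work differently.

```python
# Multipliers copied from the scenario table.
SCENARIOS = {
    "baseline": {"crime": 1.0, "clearance": 1.0, "trust": 1.0},
    "reform_improved": {"crime": 0.75, "clearance": 1.4, "trust": 1.3},
    "deteriorated": {"crime": 1.35, "clearance": 0.6, "trust": 0.7},
}

# Value ranges copied from the schema table (section 3.1).
RANGES = {
    "crime_rate_per_100k": (0.1, 500.0),
    "clearance_rate": (0.05, 0.95),
    "trust_police_score": (0.1, 0.95),
}


def apply_scenario(record: dict, scenario: str) -> dict:
    """Scale three fields by the scenario multipliers, then clip to schema ranges.

    Assumption: multiply-then-clip is only one possible interpretation.
    """
    mult = SCENARIOS[scenario]
    scaled = {
        "crime_rate_per_100k": record["crime_rate_per_100k"] * mult["crime"],
        "clearance_rate": record["clearance_rate"] * mult["clearance"],
        "trust_police_score": record["trust_police_score"] * mult["trust"],
    }
    clipped = {
        key: min(max(value, RANGES[key][0]), RANGES[key][1])
        for key, value in scaled.items()
    }
    # Leave every other field of the record untouched.
    return {**record, **clipped}


# Illustrative record; the crime rate is a made-up value.
example = {"crime_rate_per_100k": 120.0, "clearance_rate": 0.378, "trust_police_score": 0.494}
print(apply_scenario(example, "reform_improved"))
```

Clipping matters at the extremes: a baseline clearance rate of 0.8 scaled by 1.4 would exceed 1.0, so the sketch caps it at the schema maximum of 0.95.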
## 3. Dataset Description
### 3.1 Schema
| Column | Type | Units | Range | Description |
|--------|------|-------|-------|-------------|
| record_id | int | — | 1–10,000 | Unique record identifier |
| country | categorical | — | 12 countries | Sub-Saharan African country |
| year | int | year | 2018–2025 | Observation year |
| region_type | categorical | — | 4 types | urban, peri_urban, rural, remote_rural |
| crime_type | categorical | — | 7 types | murder, assault, robbery, burglary, theft, vehicle_theft, sexual_offences |
| population_millions | float | millions | varies | Population |
| police_officers | int | count | varies | Number of police officers |
| police_per_100k | float | ratio | 30–350 | Officers per 100,000 population |
| crime_rate_per_100k | float | ratio | 0.1–500 | Reported crimes per 100,000 |
| reported_crimes | int | count | varies | Total reported crimes |
| clearance_rate | float | ratio | 0.05–0.95 | Cleared cases / reported crimes |
| cleared_cases | int | count | varies | Cleared cases |
| investigation_days | int | days | 7–200 | Average investigation duration |
| trust_police_score | float | score | 0.1–0.95 | Public trust in police |
| corruption_perception_index | float | score | 0.1–0.9 | Perceived corruption |
| response_time_minutes | int | minutes | 5–200 | Average police response time |
| community_engagement_score | float | score | 0.1–0.95 | Community policing engagement |
| resolution_quality_score | float | score | 0.1–0.95 | Case resolution quality |
| policing_effectiveness | categorical | — | 4 levels | effective (≥0.55), moderate (0.35 to <0.55), limited (0.20 to <0.35), ineffective (<0.20) |
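
Two schema entries define derived values: `clearance_rate` is cleared cases divided by reported crimes, and `policing_effectiveness` bins a score into four levels. The sketch below illustrates both. The card does not say which score is binned, so `classify_effectiveness` takes a generic score between 0 and 1 and assumes each lower bound is inclusive. Both function names are hypothetical.

```python
def clearance_rate(cleared_cases: int, reported_crimes: int) -> float:
    """Cleared cases divided by reported crimes, as described in the schema."""
    if reported_crimes <= 0:
        raise ValueError("reported_crimes must be positive")
    return cleared_cases / reported_crimes


def classify_effectiveness(score: float) -> str:
    """Map a 0-1 score to the four effectiveness levels.

    Assumption: lower bounds are inclusive (0.55 -> effective, 0.35 -> moderate).
    """
    if score >= 0.55:
        return "effective"
    if score >= 0.35:
        return "moderate"
    if score >= 0.20:
        return "limited"
    return "ineffective"


# Made-up counts, only to show the call shape.
print(clearance_rate(cleared_cases=38, reported_crimes=100))
print(classify_effectiveness(0.41))
```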
### 3.2 Summary Statistics (baseline)
| Variable | Mean | SD | Min | Max |
|----------|------|-----|-----|-----|
| police_per_100k | 153.9 | 55.2 | 30 | 350 |
| clearance_rate | 0.378 | 0.142 | 0.05 | 0.95 |
| trust_police_score | 0.494 | 0.121 | 0.10 | 0.95 |
| response_time_minutes | 38 | 28 | 5 | 200 |
| corruption_perception_index | 0.52 | 0.15 | 0.10 | 0.90 |
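
Statistics like these can be recomputed from the CSV with the standard library alone. The sketch below uses toy rows with made-up values, and the helper name `summarize` is hypothetical. The card does not state whether SD is the sample or population form; the sketch uses the sample form.

```python
import statistics


def summarize(rows, column):
    """Return mean, sample SD, min, and max of one numeric column."""
    values = [float(row[column]) for row in rows]
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1); an assumption
        "min": min(values),
        "max": max(values),
    }


# Toy rows with made-up values; strings mimic what csv.DictReader yields.
toy_rows = [
    {"clearance_rate": "0.20"},
    {"clearance_rate": "0.40"},
    {"clearance_rate": "0.60"},
]
stats = summarize(toy_rows, "clearance_rate")
print({name: round(value, 3) for name, value in stats.items()})
```

With a local copy of the data, the rows could instead come from `csv.DictReader` opened on `data/baseline.csv`.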
## 4. Usage
### 4.1 Loading with Hugging Face `datasets`
```python
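from datasets import load_dataset

# With no config name, the default config ("baseline") is loaded.
ds = load_dataset("electricsheepafrica/african-police-crime-statistics")
# Pass a config name to load another scenario.
ds_reform = load_dataset("electricsheepafrica/african-police-crime-statistics", "reform_improved")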
```
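
To compare scenarios, it can help to load all three configs at once. The sketch below assumes each CSV config exposes a single `train` split, which is the usual `datasets` behavior for a data file with no explicit split. `load_scenarios` is a hypothetical helper, and calling it requires network access.

```python
REPO_ID = "electricsheepafrica/african-police-crime-statistics"
CONFIGS = ("baseline", "reform_improved", "deteriorated")


def load_scenarios(configs=CONFIGS):
    """Return a dict mapping each config name to its `train` split."""
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return {name: load_dataset(REPO_ID, name, split="train") for name in configs}
```

Calling `load_scenarios()["baseline"].to_pandas()` would then give a pandas DataFrame for the baseline scenario.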
### 4.2 Regenerating
```bash
pip install numpy pandas scipy matplotlib
python generate_dataset.py --scenario baseline --n 10000 --seed 42
python validate_dataset.py
```
## 5. Limitations & Ethical Considerations
1. **Synthetic data**: Not suitable for operational policing decisions or crime mapping.
2. **Underreporting**: The dataset models reported crime; actual crime rates are likely higher because many offences are never reported.
3. **Crime type aggregation**: Seven broad categories; actual crime classification is more nuanced.
4. **No victim demographics**: Age, gender, and socioeconomic status of victims are not modeled.
5. **Police misconduct excluded**: Use of force and extrajudicial actions are not captured.
## 6. References
1. South African Government, *Crime Statistics 2024/2025 Q4*, 2025.
2. SAPS, *Police Recorded Crime Statistics 2024/2025*.
3. Macrotrends, *Sub-Saharan Africa Crime Rate & Statistics*.
4. Afrobarometer, *Law Enforcers or Law Breakers? PP90*, 2024.
5. The Conversation, *Africa-wide survey of police*, 2024.
6. Africa Check, *Police-to-population ratio analysis*.
7. GroundUp, *How SA police force compares to other countries*, 2025.
8. Nexus Engineering, *Nigeria's Policing Crisis analysis*, 2025.
9. Crimehub, *Guide to understanding SA crime statistics*, 2024.
## Citation
```bibtex
@dataset{esa_police_crime_2026,
title={African Police \& Crime Statistics},
author={{Electric Sheep Africa}},
year={2026},
publisher={HuggingFace},
url={https://huggingface.co/datasets/electricsheepafrica/african-police-crime-statistics},
license={CC-BY-4.0}
}
```
## License
CC-BY-4.0
Provider:
electricsheepafrica



