electricsheepafrica/africa-financial-times-excess-mortality-during-covid-19-pandemic-data
Hugging Face · Updated 2026-04-09 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-financial-times-excess-mortality-during-covid-19-pandemic-data
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-sa-4.0
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
source_datasets:
- original
task_categories:
- tabular-classification
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- covid-19
- fatalities
- hxl
- aut
- bel
- bra
- chl
- dnk
pretty_name: "Financial Times - Excess mortality during COVID-19 pandemic"
dataset_info:
  splits:
  - name: train
    num_examples: 87529
  - name: test
    num_examples: 21882
---
# Financial Times - Excess mortality during COVID-19 pandemic
**Publisher:** HDX · **Source:** [HDX](https://data.humdata.org/dataset/financial-times-excess-mortality-during-covid-19-pandemic-data) · **License:** `cc-by-sa` · **Updated:** 2026-01-30
---
## Abstract
This dataset contains excess mortality data for the period covering the 2020 Covid-19 pandemic.
It covers all known jurisdictions that publish all-cause mortality data meeting the following criteria:
- daily, weekly or monthly level of granularity
- includes equivalent historical data for at least one full year before 2020, and preferably at least five years (2015-2019)
Most countries publish mortality data with a longer periodicity (typically quarterly or even annually), a longer publication lag time, or both. Such data is not suitable for ongoing analysis during an epidemic and is therefore not included here.
"Excess mortality" refers to the difference between deaths from all causes during the pandemic and the historic seasonal average. For many of the jurisdictions shown here, this figure is higher than the official Covid-19 fatalities that national governments publish each day. While not all of these deaths are necessarily attributable to the disease, the gap leaves a number of unexplained deaths, which suggests that the official figures may be significant undercounts of the pandemic's impact.
Each row in this dataset represents one observation for a first-level administrative unit. Temporal coverage is indicated by the `date` column. Geographic scope: **AUT, BEL, BRA, CHL, DNK, ECU, FRA, DEU, and 16 others**.
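As a rough illustration of the excess-mortality definition above, the sketch below computes excess deaths as observed deaths minus the mean for the same week in earlier years. The toy numbers, the 2015-2019 baseline window, and the helper name `excess_vs_baseline` are assumptions made for illustration; this is not the Financial Times methodology. A calculation like this may be useful because the Curation section notes that the `expected_deaths` and `excess_deaths` columns were dropped.

```python
import pandas as pd

def excess_vs_baseline(df, baseline_years=(2015, 2016, 2017, 2018, 2019), target_year=2020):
    """Hypothetical helper: expects columns country, region, year, week, deaths."""
    keys = ["country", "region", "week"]
    # Historical average for each jurisdiction and week of the year.
    baseline = (
        df[df["year"].isin(list(baseline_years))]
        .groupby(keys, as_index=False)["deaths"]
        .mean()
        .rename(columns={"deaths": "baseline_deaths"})
    )
    # Attach the baseline to the target-year rows and take the difference.
    target = df[df["year"] == target_year].merge(baseline, on=keys, how="left")
    target["excess_deaths"] = target["deaths"] - target["baseline_deaths"]
    target["excess_pct"] = 100 * target["excess_deaths"] / target["baseline_deaths"]
    return target

# Toy data (made up): one region, week 14, five baseline years plus 2020.
toy = pd.DataFrame({
    "country": ["Italy"] * 6,
    "region": ["Region A"] * 6,
    "year": [2015, 2016, 2017, 2018, 2019, 2020],
    "week": [14] * 6,
    "deaths": [1000.0, 1100.0, 900.0, 1050.0, 950.0, 2500.0],
})
result = excess_vs_baseline(toy)
print(result[["week", "deaths", "baseline_deaths", "excess_deaths", "excess_pct"]])
```

In the toy data the baseline mean is 1,000 deaths, so the 2,500 deaths observed in 2020 give 1,500 excess deaths, or 150% above baseline.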
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
---
## Dataset Characteristics
| | |
|---|---|
| **Domain** | Epidemiology and disease surveillance |
| **Unit of observation** | First-level administrative unit observations |
| **Rows (total)** | 109,412 |
| **Columns** | 11 (5 numeric, 5 categorical, 1 datetime) |
| **Train split** | 87,529 rows |
| **Test split** | 21,882 rows |
| **Geographic scope** | AUT, BEL, BRA, CHL, DNK, ECU, FRA, DEU, and 16 others |
| **Publisher** | HDX |
| **HDX last updated** | 2026-01-30 |
---
## Variables
**Geographic:** `country` (US, UK, Italy), `region` (North West, North East, South East).
**Temporal:** `period` (week, month), `year` (range 2000–2021), `month` (range 1.0–12.0), `week` (range 1.0–53.0), `date`.
**Outcome / Measurement:** `deaths` (range 1.0–243235.0), `total_excess_deaths_pct` (range -472.7804–528.3219).
**Identifier / Metadata:** `esa_source` (HDX), `esa_processed` (2026-04-09).
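Because `period` takes the values `week` and `month`, weekly and monthly observations sit in the same table. The sketch below shows one way to keep them apart before aggregating, so that counts at different granularities are not summed together. The toy rows are made up, and setting aside rows with a missing `period` is an assumption about how you may want to treat them.

```python
import pandas as pd

# Toy rows (made up) using a subset of the columns listed above.
toy = pd.DataFrame({
    "country": ["UK", "UK", "US", "US"],
    "region": ["North West", "North West", "Region A", "Region A"],
    "period": ["week", "week", "month", None],
    "year": [2020, 2020, 2020, 2020],
    "deaths": [120.0, 150.0, 900.0, 800.0],
})

# Keep weekly and monthly series apart; rows with no period label are set aside.
weekly = toy[toy["period"] == "week"]
monthly = toy[toy["period"] == "month"]
unlabelled = toy[toy["period"].isna()]

# Aggregate only within a single granularity.
weekly_totals = weekly.groupby(["country", "region", "year"])["deaths"].sum()
print(weekly_totals)
print(len(weekly), len(monthly), len(unlabelled))
```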
---
## Quick Start
```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-financial-times-excess-mortality-during-covid-19-pandemic-data")

# Convert each split to a pandas DataFrame for analysis.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `country` | object | 0.0% | US, UK, Italy |
| `region` | object | 0.0% | North West, North East, South East |
| `period` | object | 2.1% | week, month |
| `year` | int64 | 0.0% | 2000.0 – 2021.0 (mean 2016.2524) |
| `month` | float64 | 30.0% | 1.0 – 12.0 (mean 6.3726) |
| `week` | float64 | 3.2% | 1.0 – 53.0 (mean 25.812) |
| `date` | datetime64[ns] | 0.0% | |
| `deaths` | float64 | 0.0% | 1.0 – 243235.0 (mean 1937.7782) |
| `total_excess_deaths_pct` | float64 | 1.9% | -472.7804 – 528.3219 (mean 24.0445) |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-09 |
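To check the schema table against a loaded split, a per-column profile of dtype and null percentage can be computed along the following lines. This is a minimal sketch on made-up toy data; the 20% flagging threshold mirrors the one mentioned under Limitations, and the variable names are illustrative.

```python
import pandas as pd

# Toy frame (made up); in practice use the DataFrame loaded in Quick Start.
toy = pd.DataFrame({
    "country": ["US", "UK", "Italy", "US"],
    "month": [1.0, None, 3.0, None],
    "deaths": [10.0, 20.0, 30.0, 40.0],
})

# Share of missing values per column, as a percentage.
null_pct = (toy.isna().mean() * 100).round(1)
profile = pd.DataFrame({"dtype": toy.dtypes.astype(str), "null_pct": null_pct})
print(profile)

# Columns whose missing share exceeds 20%.
flagged = null_pct[null_pct > 20].index.tolist()
print(flagged)
```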
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `year` | 2000.0 | 2021.0 | 2016.2524 | 2017.0 |
| `month` | 1.0 | 12.0 | 6.3726 | 6.0 |
| `week` | 1.0 | 53.0 | 25.812 | 25.0 |
| `deaths` | 1.0 | 243235.0 | 1937.7782 | 645.0 |
| `total_excess_deaths_pct` | -472.7804 | 528.3219 | 24.0445 | 17.0433 |
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. Two columns with more than 80% missing values were removed: `expected_deaths` and `excess_deaths`. One column was cast from string to numeric or datetime based on its parse-success rate (threshold: more than 85%). The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
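The curation steps described above can be sketched roughly as follows. This is not the ESA pipeline itself: the function names `to_snake_case` and `clean`, the toy input, and the use of `DataFrame.sample` for the split are assumptions, and only the numeric cast is shown (the datetime cast would follow the same pattern).

```python
import re
import pandas as pd

MISSING_MARKERS = ["N/A", "null", "none", "-", "unknown", "no data", "#N/A"]

def to_snake_case(name):
    # Lowercase, then collapse runs of other characters into a single "_".
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")

def clean(df, drop_threshold=0.80, cast_threshold=0.85):
    """Hypothetical cleaning helper mirroring the steps described in the text."""
    df = df.rename(columns=to_snake_case)
    # Replace the known missing-value markers with NaN.
    df = df.mask(df.isin(MISSING_MARKERS))
    # Drop columns whose share of missing values exceeds the threshold.
    df = df.loc[:, df.isna().mean() <= drop_threshold].copy()
    # Cast non-numeric columns when enough non-missing values parse as numbers.
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            continue
        present = df[col].notna()
        parsed = pd.to_numeric(df[col], errors="coerce")
        if present.any() and parsed[present].notna().mean() > cast_threshold:
            df[col] = parsed
    return df

# Toy input (made up).
raw = pd.DataFrame({
    "Country": ["US", "UK", "Italy", "US", "UK"],
    "Total Deaths": ["10", "20", "N/A", "40", "50"],
    "Mostly Empty": ["-", "unknown", "null", "none", "no data"],
})
cleaned = clean(raw)

# 80/20 split with a fixed seed; sample() is an assumed way to do it.
train = cleaned.sample(frac=0.8, random_state=42)
test = cleaned.drop(train.index)
print(cleaned.dtypes)
print(len(train), len(test))
```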
---
## Limitations
- Data originates from HDX and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- The `month` column has more than 20% missing values (30.0%) and should be treated with caution in modelling.
- This dataset spans 24 countries; geographic and methodological inconsistencies across national boundaries may affect cross-country comparability.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/financial-times-excess-mortality-during-covid-19-pandemic-data) for the publisher's own methodology notes and caveats.
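One possible way to work around the missing `month` values is to derive the month from the `date` column, which the schema reports as fully populated. The sketch below assumes that `date` falls within the month that `month` is meant to describe, which the source does not confirm; the toy rows and the new column names are made up.

```python
import pandas as pd

# Toy rows (made up): date is always present, month is sometimes missing.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-04-03", "2020-04-10", "2020-05-01"]),
    "month": [4.0, None, None],
})

# Keep a flag so imputed values stay distinguishable from reported ones.
toy["month_was_missing"] = toy["month"].isna()
# Fill gaps with the calendar month of the date column.
toy["month_filled"] = toy["month"].fillna(toy["date"].dt.month)
print(toy)
```

Keeping the `month_was_missing` flag lets a model or analysis exclude the derived values later if the assumption turns out not to hold.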
---
## Citation
```bibtex
@dataset{hdx_africa_financial_times_excess_mortality_during_covid_19_pandemic_data,
title = {Financial Times - Excess mortality during COVID-19 pandemic},
author = {HDX},
year = {2026},
url = {https://data.humdata.org/dataset/financial-times-excess-mortality-during-covid-19-pandemic-data},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provider:
electricsheepafrica



