electricsheepafrica/africa-tgo-rainfall-subnational
Hugging Face · Updated 2026-04-06 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-tgo-rainfall-subnational
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
source_datasets:
- original
task_categories:
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- climate-weather
- environment
- tgo
pretty_name: "Togo: Rainfall Indicators at Subnational Level"
dataset_info:
  splits:
  - name: train
    num_examples: 45640
  - name: test
    num_examples: 11410
---
# Togo: Rainfall Indicators at Subnational Level
**Publisher:** WFP - World Food Programme · **Source:** [HDX](https://data.humdata.org/dataset/tgo-rainfall-subnational) · **License:** `cc-by` · **Updated:** 2026-04-03
---
## Abstract
This dataset contains dekadal (10-day) rainfall indicators computed from Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) version 2, which blends satellite imagery with in situ station data, and from the CHIRPS-GEFS short-term rainfall forecasts. Values are aggregated by subnational administrative unit.
The following indicators are included for each dekad:
- 10 day rainfall [mm] (`rfh`)
- rainfall 1-month rolling aggregation [mm] (`r1h`)
- rainfall 3-month rolling aggregation [mm] (`r3h`)
- rainfall long term average [mm] (`rfh_avg`)
- rainfall 1-month rolling aggregation long term average [mm] (`r1h_avg`)
- rainfall 3-month rolling aggregation long term average [mm] (`r3h_avg`)
- rainfall anomaly [%] (`rfq`)
- rainfall 1-month anomaly [%] (`r1q`)
- rainfall 3-month anomaly [%] (`r3q`)
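The anomaly columns are expressed in percent, and their means in the summary tables sit close to 100, which suggests they relate observed rainfall to the long-term average. The sketch below computes that plain ratio. This is an assumption, not the publisher's documented formula: `percent_of_average` is a hypothetical helper, and published `rfq` values may differ from it (for example through smoothing or offsets).

```python
def percent_of_average(observed_mm, long_term_avg_mm):
    """Return observed rainfall as a percentage of the long-term average.

    Assumption: a plain ratio; the publisher's exact anomaly formula
    is not given in this card and may differ.
    """
    if long_term_avg_mm <= 0:
        return None  # ratio is undefined when the average is zero
    return 100.0 * observed_mm / long_term_avg_mm


# Illustrative values only, not taken from the dataset.
print(percent_of_average(45.0, 30.0))  # 150.0
print(percent_of_average(15.0, 30.0))  # 50.0
```

Under this reading, values above 100 indicate a wetter-than-average period and values below 100 a drier one.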
The administrative units used for aggregation are based on WFP data, and each unit carries a Pcode reference. The number of input pixels used to create each aggregate is given in the `n_pixels` column. Finally, the `version` column (called `type` in the publisher's description) indicates whether a value comes from a forecast, a preliminary product, or a final product.
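As one way to use these columns, the sketch below keeps only final observations for a single administrative unit. The small DataFrame is made up for illustration; the column names and the `final` / `prelim` / `forecast` labels follow the Schema section, where the product label is stored in `version`.

```python
import pandas as pd

# Made-up rows for illustration; real data comes from load_dataset(...).
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2026-03-01", "2026-03-11", "2026-03-21", "2026-03-21"]
        ),
        "pcode": ["TG01", "TG01", "TG01", "TG0407"],
        "rfh": [12.5, 30.0, 41.2, 18.9],
        "n_pixels": [564, 564, 564, 11],
        "version": ["final", "final", "forecast", "prelim"],
    }
)

# Keep final observations for one unit, ordered in time.
final_tg01 = (
    df[(df["version"] == "final") & (df["pcode"] == "TG01")]
    .sort_values("date")
    .reset_index(drop=True)
)
print(final_tg01[["date", "rfh"]])
```

Filtering on `version` first avoids mixing forecast values with observed rainfall in the same series.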
Forecasts are issued on the 6th, 16th, and 26th of each month for the upcoming 10-day period (dekad), then updated with improved versions on the 1st, 11th, and 21st.
Preliminary observations replace the previous dekad’s forecast on the 3rd, 13th, and 23rd. They are later replaced by final observations, published mid-month (13th or 23rd), which cover all three dekads of the prior month. The forecast schedule is summarised below:
| Publication Day | Forecast type | Covers (Dekad) |
|---|---|---|
| 1st | Updated forecast | 1–10 of the same month |
| 6th | Initial forecast | 11–20 of the same month |
| 11th | Updated forecast | 11–20 of the same month |
| 16th | Initial forecast | 21–end of the same month |
| 21st | Updated forecast | 21–end of the same month |
| 26th | Initial forecast | 1–10 of the following month |
For more on CHIRPS-GEFS forecasts, see: https://www.chc.ucsb.edu/data/chirps-gefs
For further details, please see the methodology section.
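Because the schedule is organised around dekads (days 1–10, 11–20, and 21 to the end of the month), a small helper for locating the dekad that contains a given date can be handy. `dekad_bounds` is a hypothetical name used only for this sketch. Whether the dataset's `date` column marks the start or the end of each dekad is not stated here, so check before joining on it.

```python
import calendar
from datetime import date


def dekad_bounds(day):
    """Return (start, end) dates of the dekad containing `day`.

    Dekads run 1-10, 11-20, and 21 to the end of the month.
    """
    if day.day <= 10:
        first, last = 1, 10
    elif day.day <= 20:
        first, last = 11, 20
    else:
        first = 21
        # monthrange returns (weekday of first day, days in month).
        last = calendar.monthrange(day.year, day.month)[1]
    return day.replace(day=first), day.replace(day=last)


start, end = dekad_bounds(date(2026, 2, 25))
print(start.isoformat(), end.isoformat())
```

The third dekad varies in length (8 to 11 days), which is worth keeping in mind when comparing `rfh` totals across dekads.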
Each row is a time-series observation for one administrative unit and dekad. Temporal coverage is indicated by the `date` column. Geographic scope: **TGO** (Togo).
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
---
## Dataset Characteristics
| Property | Value |
|---|---|
| **Domain** | Climate and environment |
| **Unit of observation** | Time-series observations |
| **Rows (total)** | 57,050 |
| **Columns** | 17 (12 numeric, 4 categorical, 1 datetime) |
| **Train split** | 45,640 rows |
| **Test split** | 11,410 rows |
| **Geographic scope** | TGO |
| **Publisher** | WFP - World Food Programme |
| **HDX last updated** | 2026-04-03 |
---
## Variables
**Geographic** — `n_pixels` (range 11.0–564.0).
**Temporal** — `date`.
**Identifier / Metadata** — `adm_id` (range 2970.0–65288.0), `pcode` (TG01, TG0407, TG0306), `esa_source` (HDX), `esa_processed` (2026-04-06).
**Other** — `adm_level` (range 1.0–2.0), `rfh` (range 0.0–298.5385), `rfh_avg` (range 0.0017–124.8204), `r1h` (range 0.0417–632.0769), `r1h_avg` (range 0.1333–352.5602) and 6 others.
---
## Quick Start
```python
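from datasets import load_dataset

# Download both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-tgo-rainfall-subnational")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())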
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `date` | datetime64[ns] | 0.0% | |
| `adm_level` | int64 | 0.0% | 1.0 – 2.0 (mean 1.8571) |
| `adm_id` | int64 | 0.0% | 2970.0 – 65288.0 (mean 42306.6) |
| `pcode` | object | 0.0% | TG01, TG0407, TG0306 |
| `n_pixels` | float64 | 0.0% | 11.0 – 564.0 (mean 106.9429) |
| `rfh` | float64 | 0.0% | 0.0 – 298.5385 (mean 33.5456) |
| `rfh_avg` | float64 | 0.0% | 0.0017 – 124.8204 (mean 33.6285) |
| `r1h` | float64 | 0.1% | 0.0417 – 632.0769 (mean 100.7206) |
| `r1h_avg` | float64 | 0.1% | 0.1333 – 352.5602 (mean 100.9572) |
| `r3h` | float64 | 0.5% | 1.125 – 1187.1613 (mean 302.8479) |
| `r3h_avg` | float64 | 0.5% | 2.0917 – 904.3011 (mean 303.6439) |
| `rfq` | float64 | 0.0% | 11.6196 – 642.7259 (mean 100.1677) |
| `r1q` | float64 | 0.1% | 12.2134 – 536.4545 (mean 100.0118) |
| `r3q` | float64 | 0.5% | 16.9567 – 375.8452 (mean 99.9516) |
| `version` | object | 0.0% | final, prelim, forecast |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-06 |
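After loading, it can be useful to confirm that a DataFrame matches this schema. The following is a minimal sketch: `EXPECTED_COLUMNS` and `check_schema` are hypothetical names, the column list is copied from the table above, and the demo frame is made up.

```python
import pandas as pd

# Column names copied from the Schema table.
EXPECTED_COLUMNS = [
    "date", "adm_level", "adm_id", "pcode", "n_pixels",
    "rfh", "rfh_avg", "r1h", "r1h_avg", "r3h", "r3h_avg",
    "rfq", "r1q", "r3q", "version", "esa_source", "esa_processed",
]


def check_schema(df):
    """Return (missing column names, null percentage per present column)."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    present = [c for c in EXPECTED_COLUMNS if c in df.columns]
    null_pct = (df[present].isna().mean() * 100).round(1)
    return missing, null_pct


# Made-up frame with one null in r3h, for illustration only.
demo = pd.DataFrame({"pcode": ["TG01", "TG01"], "r3h": [None, 300.0]})
missing, null_pct = check_schema(demo)
print(len(missing), null_pct["r3h"])
```

On the real splits, `missing` should be empty and the null percentages should be close to the table's values, although each split may differ slightly from figures computed on the full dataset.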
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `adm_level` | 1.0 | 2.0 | 1.8571 | 2.0 |
| `adm_id` | 2970.0 | 65288.0 | 42306.6 | 27403.0 |
| `n_pixels` | 11.0 | 564.0 | 106.9429 | 60.0 |
| `rfh` | 0.0 | 298.5385 | 33.5456 | 25.2381 |
| `rfh_avg` | 0.0017 | 124.8204 | 33.6285 | 32.7208 |
| `r1h` | 0.0417 | 632.0769 | 100.7206 | 89.0035 |
| `r1h_avg` | 0.1333 | 352.5602 | 100.9572 | 96.2889 |
| `r3h` | 1.125 | 1187.1613 | 302.8479 | 284.2417 |
| `r3h_avg` | 2.0917 | 904.3011 | 303.6439 | 287.6502 |
| `rfq` | 11.6196 | 642.7259 | 100.1677 | 94.8276 |
| `r1q` | 12.2134 | 536.4545 | 100.0118 | 96.3704 |
| `r3q` | 16.9567 | 375.8452 | 99.9516 | 98.1553 |
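A summary in this shape can be recomputed with pandas. The sketch below uses made-up values so that it runs on its own; substituting the real frames and the full list of numeric columns is left to the reader.

```python
import pandas as pd

# Made-up values for illustration; substitute the real train/test frames.
df = pd.DataFrame(
    {
        "rfh": [0.0, 25.0, 50.0, 125.0],
        "rfq": [50.0, 90.0, 100.0, 160.0],
    }
)

# One row per column, one statistic per table column.
summary = df[["rfh", "rfq"]].agg(["min", "max", "mean", "median"]).T
print(summary)
```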
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. One column was cast from string to numeric or datetime based on its parse-success rate (>85% threshold). The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
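The steps above can be approximated in pandas as follows. This is a sketch under stated assumptions, not the actual ESA pipeline: `to_snake_case` and `curate` are hypothetical helpers, only numeric casting is shown (datetime casting is omitted), and the download and Parquet-writing steps are left out.

```python
import re

import numpy as np
import pandas as pd

MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name):
    """Lowercase a column name and collapse non-alphanumerics to '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def curate(raw, threshold=0.85, seed=42):
    """Clean column names and values, then return (train, test)."""
    df = raw.rename(columns=to_snake_case)
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            continue
        # Unify common missing-value markers to NaN (case-insensitive).
        text = df[col].astype(str).str.strip().str.lower()
        df[col] = df[col].mask(text.isin(MISSING_MARKERS), np.nan)
        # Cast to numeric only if enough non-null values parse cleanly.
        parsed = pd.to_numeric(df[col], errors="coerce")
        non_null = df[col].notna().sum()
        if non_null and parsed.notna().sum() / non_null > threshold:
            df[col] = parsed
    # Seeded 80/20 random split.
    train = df.sample(frac=0.8, random_state=seed)
    test = df.drop(train.index)
    return train, test


# Made-up raw frame for illustration.
raw = pd.DataFrame(
    {"P Code": ["TG01"] * 10, "RFH": ["1.5", "N/A"] + ["2.0"] * 8}
)
train, test = curate(raw)
print(list(train.columns), len(train), len(test), train["rfh"].dtype)
```

Fixing the seed makes the split reproducible: rerunning `curate` on the same input yields the same train and test rows.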
---
## Limitations
- Data originates from WFP - World Food Programme and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/tgo-rainfall-subnational) for the publisher's own methodology notes and caveats.
---
## Citation
```bibtex
@dataset{hdx_africa_tgo_rainfall_subnational,
title = {Togo: Rainfall Indicators at Subnational Level},
author = {WFP - World Food Programme},
year = {2026},
url = {https://data.humdata.org/dataset/tgo-rainfall-subnational},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*