electricsheepafrica/africa-nam-rainfall-subnational

Hugging Face · Updated 2026-04-06 · Indexed 2026-04-12
Download link: https://hf-mirror.com/datasets/electricsheepafrica/africa-nam-rainfall-subnational
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
source_datasets:
- original
task_categories:
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- climate-weather
- environment
- nam
pretty_name: "Namibia: Rainfall Indicators at Subnational Level"
dataset_info:
  splits:
  - name: train
    num_examples: 169520
  - name: test
    num_examples: 42380
---

# Namibia: Rainfall Indicators at Subnational Level

**Publisher:** WFP - World Food Programme · **Source:** [HDX](https://data.humdata.org/dataset/nam-rainfall-subnational) · **License:** `cc-by` · **Updated:** 2026-04-03

---

## Abstract

This dataset contains dekadal rainfall indicators, computed from Climate Hazards Group InfraRed Precipitation satellite imagery with in situ Station data (CHIRPS) version 2 and the CHIRPS-GEFS short-term rainfall forecasts, aggregated by subnational administrative units.

Included indicators are (for each dekad):

- 10-day rainfall [mm] (`rfh`)
- rainfall 1-month rolling aggregation [mm] (`r1h`)
- rainfall 3-month rolling aggregation [mm] (`r3h`)
- rainfall long term average [mm] (`rfh_avg`)
- rainfall 1-month rolling aggregation long term average [mm] (`r1h_avg`)
- rainfall 3-month rolling aggregation long term average [mm] (`r3h_avg`)
- rainfall anomaly [%] (`rfq`)
- rainfall 1-month anomaly [%] (`r1q`)
- rainfall 3-month anomaly [%] (`r3q`)

The administrative units used for aggregation are based on WFP data and contain a Pcode reference attributed to each unit. The number of input pixels used to create the aggregates is provided in the `n_pixels` column. Finally, the `version` column (called `type` in the publisher's description) indicates whether the value is based on a forecast, a preliminary product, or a final product.

Forecasts are issued on the 6th, 16th, and 26th of each month for the upcoming 10-day period (dekad), then updated with improved versions on the 1st, 11th, and 21st.
Preliminary observations replace the previous dekad's forecast on the 3rd, 13th, and 23rd, and are later replaced by final observations, published mid-month (13th or 23rd), covering all three dekads of the prior month. Please find a summary below:

| Publication day | Forecast type | Covers (dekad) |
|---|---|---|
| 1st | Updated forecast | 1–10 of the same month |
| 6th | Initial forecast | 11–20 of the same month |
| 11th | Updated forecast | 1–10 of the same month |
| 16th | Initial forecast | 21–end of the same month |
| 21st | Updated forecast | 11–20 of the same month |
| 26th | Initial forecast | 1–10 of the following month |

For more on CHIRPS-GEFS forecasts, see: https://www.chc.ucsb.edu/data/chirps-gefs

For further details, please see the methodology section.

Each row in this dataset represents time-series observations. Temporal coverage is indicated by the `date` column(s). Geographic scope: **NAM**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Climate and environment |
| **Unit of observation** | Time-series observations |
| **Rows (total)** | 211,900 |
| **Columns** | 17 (12 numeric, 4 categorical, 1 datetime) |
| **Train split** | 169,520 rows |
| **Test split** | 42,380 rows |
| **Geographic scope** | NAM |
| **Publisher** | WFP - World Food Programme |
| **HDX last updated** | 2026-04-03 |

---

## Variables

**Geographic:** `n_pixels` (range 1.0–5855.0).

**Temporal:** `date`.

**Identifier / Metadata:** `adm_id` (range 900789.0–1009001.0), `pcode` (NA1401, NA1201, NA0105), `esa_source` (HDX), `esa_processed` (2026-04-06).

**Other:** `adm_level` (range 1.0–2.0), `rfh` (range 0.0–218.4286), `rfh_avg` (range 0.0–67.2908), `r1h` (range 0.0–449.4286), `r1h_avg` (range 0.0–173.8048) and 6 others.
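The abstract describes three dekads per month: days 1–10, days 11–20, and day 21 to the end of the month. As a minimal sketch, the helper below maps any calendar date to the bounds of its dekad. The function name `dekad_bounds` is hypothetical and not part of the dataset or any WFP tooling, and the card does not say which day of the dekad the `date` column stores, so check that before joining on it.

```python
import calendar
from datetime import date


def dekad_bounds(day: date) -> tuple[date, date]:
    """Return the first and last calendar day of the dekad containing `day`."""
    if day.day <= 10:
        start, end = 1, 10
    elif day.day <= 20:
        start, end = 11, 20
    else:
        # The third dekad runs to the end of the month (28 to 31 days).
        start, end = 21, calendar.monthrange(day.year, day.month)[1]
    return day.replace(day=start), day.replace(day=end)


# Example: a date in the third dekad of a non-leap February.
print(dekad_bounds(date(2026, 2, 23)))
```

The third dekad is the only one whose length varies, which is why it needs the month length from `calendar.monthrange`.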
---

## Quick Start

```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-nam-rainfall-subnational")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `date` | datetime64[ns] | 0.0% | |
| `adm_level` | int64 | 0.0% | 1.0 – 2.0 (mean 1.8923) |
| `adm_id` | int64 | 0.0% | 900789.0 – 1009001.0 (mean 997293.8615) |
| `pcode` | object | 0.0% | NA1401, NA1201, NA0105 |
| `n_pixels` | float64 | 0.0% | 1.0 – 5855.0 (mean 444.2154) |
| `rfh` | float64 | 0.0% | 0.0 – 218.4286 (mean 9.8286) |
| `rfh_avg` | float64 | 0.0% | 0.0 – 67.2908 (mean 10.2777) |
| `r1h` | float64 | 0.1% | 0.0 – 449.4286 (mean 29.4758) |
| `r1h_avg` | float64 | 0.1% | 0.0 – 173.8048 (mean 30.8776) |
| `r3h` | float64 | 0.5% | 0.0 – 795.4419 (mean 88.1114) |
| `r3h_avg` | float64 | 0.5% | 0.0 – 434.5 (mean 92.3847) |
| `rfq` | float64 | 0.0% | 11.9843 – 575.5779 (mean 98.5557) |
| `r1q` | float64 | 0.1% | 8.3363 – 467.0631 (mean 98.0786) |
| `r3q` | float64 | 0.5% | 16.8454 – 439.8303 (mean 97.4343) |
| `version` | object | 0.0% | final, prelim, forecast |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-06 |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `adm_level` | 1.0 | 2.0 | 1.8923 | 2.0 |
| `adm_id` | 900789.0 | 1009001.0 | 997293.8615 | 1008932.5 |
| `n_pixels` | 1.0 | 5855.0 | 444.2154 | 151.5 |
| `rfh` | 0.0 | 218.4286 | 9.8286 | 2.1538 |
| `rfh_avg` | 0.0 | 67.2908 | 10.2777 | 3.7166 |
| `r1h` | 0.0 | 449.4286 | 29.4758 | 8.5098 |
| `r1h_avg` | 0.0 | 173.8048 | 30.8776 | 11.095 |
| `r3h` | 0.0 | 795.4419 | 88.1114 | 39.1996 |
| `r3h_avg` | 0.0 | 434.5 | 92.3847 | 46.8246 |
| `rfq` | 11.9843 | 575.5779 | 98.5557 | 99.6954 |
| `r1q` | 8.3363 | 467.0631 | 98.0786 | 99.3131 |
| `r3q` | 16.8454 | 439.8303 | 97.4343 | 96.2234 |

---

## Curation

Raw data was
downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. One column was cast from string to numeric or datetime based on parse-success rate (>85% threshold). The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from WFP - World Food Programme and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/nam-rainfall-subnational) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_nam_rainfall_subnational,
  title = {Namibia: Rainfall Indicators at Subnational Level},
  author = {WFP - World Food Programme},
  year = {2026},
  url = {https://data.humdata.org/dataset/nam-rainfall-subnational},
  note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
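The steps listed under Curation (snake_case column names, unified missing-value markers, a seeded 80/20 split) can be illustrated with a dependency-free sketch. This is not the Electric Sheep Africa pipeline, whose code is not shown in this card: the function names are hypothetical, `None` stands in for `NaN`, case-insensitive marker matching is an assumption, and the shuffle will not reproduce the published split.

```python
import random
import re

# Missing-value markers listed in the Curation section
# (compared case-insensitively here, which is an assumption).
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name: str) -> str:
    """Lowercase a column name and join its word parts with underscores."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()


def clean_value(value):
    """Map the known missing-value markers to None; leave other values unchanged."""
    if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
        return None
    return value


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows with a fixed seed and cut it into train/test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy input with made-up values, only to exercise the helpers.
raw = [{"Date": "2026-01-01", "RFH": "N/A"}, {"Date": "2026-01-11", "RFH": "3.5"}]
rows = [{to_snake_case(k): clean_value(v) for k, v in row.items()} for row in raw]
train, test = split_rows(range(10))
print(rows[0], len(train), len(test))
```

Fixing the seed makes the split repeatable across runs of the same code. The published split sizes are consistent with an 80/20 cut: 169,520 of 211,900 rows is exactly 80%.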

Provider: electricsheepafrica