
electricsheepafrica/africa-mar-rainfall-subnational

Hugging Face · Updated 2026-04-06 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-mar-rainfall-subnational
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
source_datasets:
- original
task_categories:
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- climate-weather
- environment
- mar
pretty_name: "Morocco: Rainfall Indicators at Subnational Level"
dataset_info:
  splits:
  - name: train
    num_examples: 89976
  - name: test
    num_examples: 22494
---

# Morocco: Rainfall Indicators at Subnational Level

**Publisher:** WFP - World Food Programme · **Source:** [HDX](https://data.humdata.org/dataset/mar-rainfall-subnational) · **License:** `cc-by` · **Updated:** 2026-04-03

---

## Abstract

This dataset contains dekadal rainfall indicators, computed from Climate Hazards Group InfraRed Precipitation satellite imagery with in situ Station data (CHIRPS) version 2 and the CHIRPS-GEFS short-term rainfall forecasts, aggregated by subnational administrative units.

Included indicators are (for each dekad):

- 10 day rainfall [mm] (`rfh`)
- rainfall 1-month rolling aggregation [mm] (`r1h`)
- rainfall 3-month rolling aggregation [mm] (`r3h`)
- rainfall long term average [mm] (`rfh_avg`)
- rainfall 1-month rolling aggregation long term average [mm] (`r1h_avg`)
- rainfall 3-month rolling aggregation long term average [mm] (`r3h_avg`)
- rainfall anomaly [%] (`rfq`)
- rainfall 1-month anomaly [%] (`r1q`)
- rainfall 3-month anomaly [%] (`r3q`)

The administrative units used for aggregation are based on WFP data and contain a Pcode reference attributed to each unit. The number of input pixels used to create the aggregates is provided in the `n_pixels` column. Finally, the `type` column (listed as `version` in the schema of this release) indicates whether the value is based on a forecast, a preliminary product, or a final product.

Forecasts are issued on the 6th, 16th, and 26th of each month for the upcoming 10-day period (dekad), then updated with improved versions on the 1st, 11th, and 21st.
Preliminary observations replace the previous dekad's forecast on the 3rd, 13th, and 23rd, and are later replaced by final observations, published mid-month (13th or 23rd), covering all three dekads of the prior month. Please find a summary below:

Publication Day: Forecast type, Covers (Dekad)

- 1st: Updated forecast, 1–10 of the same month
- 6th: Initial forecast, 11–20 of the same month
- 11th: Updated forecast, 1–10 of the same month
- 16th: Initial forecast, 21–end of the same month
- 21st: Updated forecast, 11–20 of the same month
- 26th: Initial forecast, 1–10 of the following month

For more on CHIRPS-GEFS forecasts, see: https://www.chc.ucsb.edu/data/chirps-gefs

For further details, please see the methodology section.

Each row in this dataset represents time-series observations. Temporal coverage is indicated by the `date` column(s). Geographic scope: **MAR**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Climate and environment |
| **Unit of observation** | Time-series observations |
| **Rows (total)** | 112,470 |
| **Columns** | 17 (12 numeric, 4 categorical, 1 datetime) |
| **Train split** | 89,976 rows |
| **Test split** | 22,494 rows |
| **Geographic scope** | MAR |
| **Publisher** | WFP - World Food Programme |
| **HDX last updated** | 2026-04-03 |

---

## Variables

**Geographic**: `n_pixels` (range 2.0–2704.0).

**Temporal**: `date`.

**Identifier / Metadata**: `adm_id` (range 2107.0–999616.0), `pcode` (MA002, MA003, MA004), `esa_source` (HDX), `esa_processed` (2026-04-06).

**Other**: `adm_level` (range 1.0–2.0), `rfh` (range 0.0–265.0748), `rfh_avg` (range 0.0–63.4745), `r1h` (range 0.0–617.1176), `r1h_avg` (range 0.0–156.7315) and 6 others.
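
Because every indicator is reported per dekad, it can help to map a calendar date onto its dekad. The sketch below does this with the standard library only, using the 1–10, 11–20, and 21–end-of-month periods from the publication summary. The helper name `dekad_of` is hypothetical and is not part of the dataset or of any WFP tooling, and how the `date` column itself is anchored within a dekad is not stated here.

```python
import calendar
from datetime import date


def dekad_of(day: date) -> tuple[int, date, date]:
    """Return (dekad number 1-3, first day, last day) for a calendar date."""
    if day.day <= 10:
        number, start_day, end_day = 1, 1, 10
    elif day.day <= 20:
        number, start_day, end_day = 2, 11, 20
    else:
        # The third dekad runs to the end of the month (28 to 31 days).
        end_day = calendar.monthrange(day.year, day.month)[1]
        number, start_day = 3, 21
    return number, day.replace(day=start_day), day.replace(day=end_day)


# Example: 25 February 2026 falls in the third dekad of a 28-day month.
print(dekad_of(date(2026, 2, 25)))
```

The same function could be applied to the `date` column after conversion to Python dates, for instance to group rows by dekad number within each month.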

---

## Quick Start

```python
from datasets import load_dataset

# Load both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-mar-rainfall-subnational")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `date` | datetime64[ns] | 0.0% | |
| `adm_level` | int64 | 0.0% | 1.0 – 2.0 (mean 1.7826) |
| `adm_id` | int64 | 0.0% | 2107.0 – 999616.0 (mean 408742.5797) |
| `pcode` | object | 0.0% | MA002, MA003, MA004 |
| `n_pixels` | float64 | 0.0% | 2.0 – 2704.0 (mean 458.5652) |
| `rfh` | float64 | 0.0% | 0.0 – 265.0748 (mean 10.2848) |
| `rfh_avg` | float64 | 0.0% | 0.0 – 63.4745 (mean 10.8049) |
| `r1h` | float64 | 0.1% | 0.0 – 617.1176 (mean 30.8701) |
| `r1h_avg` | float64 | 0.1% | 0.0 – 156.7315 (mean 32.3882) |
| `r3h` | float64 | 0.5% | 0.0 – 1067.1307 (mean 92.4746) |
| `r3h_avg` | float64 | 0.5% | 0.1484 – 391.8053 (mean 97.0219) |
| `rfq` | float64 | 0.0% | 11.0839 – 694.696 (mean 97.7944) |
| `r1q` | float64 | 0.1% | 8.8707 – 505.7507 (mean 96.9603) |
| `r3q` | float64 | 0.5% | 17.4083 – 370.8018 (mean 96.1768) |
| `version` | object | 0.0% | final, prelim, forecast |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-06 |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `adm_level` | 1.0 | 2.0 | 1.7826 | 2.0 |
| `adm_id` | 2107.0 | 999616.0 | 408742.5797 | 147350.0 |
| `n_pixels` | 2.0 | 2704.0 | 458.5652 | 248.0 |
| `rfh` | 0.0 | 265.0748 | 10.2848 | 5.1034 |
| `rfh_avg` | 0.0 | 63.4745 | 10.8049 | 9.0767 |
| `r1h` | 0.0 | 617.1176 | 30.8701 | 19.4706 |
| `r1h_avg` | 0.0 | 156.7315 | 32.3882 | 27.771 |
| `r3h` | 0.0 | 1067.1307 | 92.4746 | 70.9076 |
| `r3h_avg` | 0.1484 | 391.8053 | 97.0219 | 83.0039 |
| `rfq` | 11.0839 | 694.696 | 97.7944 | 94.2494 |
| `r1q` | 8.8707 | 505.7507 | 96.9603 | 93.362 |
| `r3q` | 17.4083 | 370.8018 | 96.1768 | 90.9171 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. One column was cast from string to numeric or datetime based on its parse-success rate (>85% threshold). The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from WFP - World Food Programme and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/mar-rainfall-subnational) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_mar_rainfall_subnational,
  title = {Morocco: Rainfall Indicators at Subnational Level},
  author = {WFP - World Food Programme},
  year = {2026},
  url = {https://data.humdata.org/dataset/mar-rainfall-subnational},
  note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
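
The curation steps described under Curation (unifying missing-value markers, casting a string column when more than 85% of its values parse, and an 80/20 split with seed 42) can be sketched roughly as follows. This is an illustrative approximation on made-up toy rows, not the pipeline Electric Sheep Africa actually ran: the function names are hypothetical, the case-insensitive marker matching is an assumption, only the numeric cast is shown, and the Parquet export is omitted.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype

MISSING_MARKERS = ["N/A", "null", "none", "-", "unknown", "no data", "#N/A"]
PARSE_THRESHOLD = 0.85  # threshold quoted in the Curation section


def unify_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Replace common missing-value markers in text columns with NaN."""
    out = df.copy()
    markers = {m.lower() for m in MISSING_MARKERS}  # assumption: ignore case
    for col in out.columns:
        if is_object_dtype(out[col]) or is_string_dtype(out[col]):
            is_marker = out[col].astype(str).str.strip().str.lower().isin(markers)
            out[col] = out[col].mask(is_marker, np.nan)
    return out


def cast_if_parseable(series: pd.Series, threshold: float = PARSE_THRESHOLD) -> pd.Series:
    """Cast to numeric when the share of parseable non-null values exceeds threshold."""
    non_null = series.dropna()
    if non_null.empty:
        return series
    parsed = pd.to_numeric(non_null, errors="coerce")
    if parsed.notna().mean() > threshold:
        return pd.to_numeric(series, errors="coerce")
    return series


def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Randomly split rows 80/20; assumes the index values are unique."""
    train = df.sample(frac=0.8, random_state=seed)
    test = df.drop(train.index)
    return train, test


# Toy rows with made-up values, reusing two of the dataset's column names.
raw = pd.DataFrame(
    {
        "pcode": ["MA002", "MA003", "MA004", "MA002", "MA003",
                  "MA004", "MA002", "MA003", "MA004", "MA002"],
        "rfh": ["1.5", "N/A", "3.0", "0.0", "12.25",
                "-", "7.0", "2.5", "4.0", "9.0"],
    }
)
clean = unify_missing(raw)
clean["rfh"] = cast_if_parseable(clean["rfh"])
train, test = split_80_20(clean)
print(clean.dtypes)
print(len(train), len(test))
```

Note that a purely random split like this one places rows from the same administrative unit and neighbouring dekads in both partitions, which may matter when the data is used for forecasting-style evaluation.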
Provider:
electricsheepafrica