electricsheepafrica/africa-wfp-food-prices-for-zimbabwe
Hugging Face · Updated 2026-04-06 · Indexed 2026-04-12
Download link: https://hf-mirror.com/datasets/electricsheepafrica/africa-wfp-food-prices-for-zimbabwe
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
source_datasets:
- original
task_categories:
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- economics
- food-security
- indicators
- markets
- zwe
pretty_name: "Zimbabwe - Food Prices"
dataset_info:
  splits:
  - name: train
    num_examples: 14348
  - name: test
    num_examples: 3587
---
# Zimbabwe - Food Prices
**Publisher:** WFP - World Food Programme · **Source:** [HDX](https://data.humdata.org/dataset/wfp-food-prices-for-zimbabwe) · **License:** `cc-by-igo` · **Updated:** 2026-04-05
---
## Abstract
This dataset contains food price data for Zimbabwe, sourced from the World Food Programme (WFP) Price Database. The database covers foods such as maize, rice, beans, fish, and sugar for 98 countries and some 3,000 markets. It is updated weekly but consists largely of monthly data. The data go back as far as 1992 for a few countries, although many countries started reporting in 2003 or later.
Each row is a price observation recorded at a market within a subnational administrative unit. Temporal coverage is given by the `date` column. Geographic scope: **ZWE**.
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
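
Since the card gives temporal coverage only through the `date` column, the actual span has to be read from the data. The following is a minimal sketch of one way to do that with pandas. The helper name `temporal_coverage` is hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import pandas as pd

def temporal_coverage(df: pd.DataFrame, date_col: str = "date") -> dict:
    """Return the first date, last date, and number of distinct months observed."""
    dates = pd.to_datetime(df[date_col])
    return {
        "start": dates.min(),
        "end": dates.max(),
        "n_months": dates.dt.to_period("M").nunique(),
    }

# Illustrative rows only; these are not real observations from the dataset.
sample = pd.DataFrame({
    "date": ["2021-01-15", "2021-01-15", "2021-03-15"],
    "market": ["Gweru Urban", "Chiredzi Urban", "Gweru Urban"],
})
coverage = temporal_coverage(sample)
print(coverage)
```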
---
## Dataset Characteristics
| | |
|---|---|
| **Domain** | Food security and nutrition |
| **Unit of observation** | Price observation at a market within a subnational administrative unit |
| **Rows (total)** | 17,936 |
| **Columns** | 18 (6 numeric, 11 categorical, 1 datetime) |
| **Train split** | 14,348 rows |
| **Test split** | 3,587 rows |
| **Geographic scope** | ZWE |
| **Publisher** | WFP - World Food Programme |
| **HDX last updated** | 2026-04-05 |
---
## Variables
**Geographic:** `admin1` (e.g. Masvingo, Midlands, Mashonaland East), `admin2` (e.g. Mudzi, Rushinga, Mwenezi), `market` (e.g. Chiredzi Urban, Gweru Urban, Nkayi Growth Point), `latitude` (range -22.2 to -15.97), `longitude` (range 25.83 to 32.95).
**Temporal:** `date`.
**Commodity and pricing context:** `category` (e.g. cereals and tubers, non-food, miscellaneous food), `commodity` (e.g. Oil (vegetable), Salt, Sugar), `commodity_id` (range 50 to 887), `unit` (e.g. KG, L, 250 G), `pricetype` (Retail), `currency` (ZWL, USD).
**Outcome / Measurement:** `priceflag` (aggregate, actual), `price` (range 0.13 to 160000.0), `usdprice` (range 0.0 to 7.5).
**Identifier / Metadata:** `market_id` (range 708 to 8913), `esa_source` (HDX), `esa_processed`.
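
The `price` column spans two currencies (ZWL, USD) and several units (KG, L, 250 G), so raw prices are not directly comparable across rows. One possible approach, sketched below, is to work with `usdprice` and aggregate within each commodity and unit. The helper name `monthly_usd_median` and the choice of a monthly median are assumptions for illustration, and the sample values are invented.

```python
import pandas as pd

def monthly_usd_median(df: pd.DataFrame) -> pd.DataFrame:
    """Median `usdprice` per commodity, unit, and calendar month."""
    out = df.copy()
    out["month"] = pd.to_datetime(out["date"]).dt.to_period("M")
    return out.groupby(["commodity", "unit", "month"], as_index=False)["usdprice"].median()

# Illustrative rows only; these are not real observations from the dataset.
sample = pd.DataFrame({
    "date": ["2021-01-10", "2021-01-20", "2021-02-10", "2021-01-10"],
    "commodity": ["Sugar", "Sugar", "Sugar", "Salt"],
    "unit": ["KG", "KG", "KG", "KG"],
    "usdprice": [1.0, 3.0, 2.0, 0.5],
})
monthly = monthly_usd_median(sample)
print(monthly)
```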
---
## Quick Start
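```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-wfp-food-prices-for-zimbabwe")

# Convert each split to a pandas DataFrame for analysis.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)   # (rows, columns)
print(train.head())  # first five rows
```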
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `date` | datetime64[ns] | 0.0% | |
| `admin1` | object | 4.0% | Masvingo, Midlands, Mashonaland East |
| `admin2` | object | 4.0% | Mudzi, Rushinga, Mwenezi |
| `market` | object | 0.0% | Chiredzi Urban, Gweru Urban, Nkayi Growth Point |
| `market_id` | int64 | 0.0% | 708 – 8913 (mean 4178.2567) |
| `latitude` | float64 | 4.0% | -22.2 – -15.97 (mean -18.9962) |
| `longitude` | float64 | 4.0% | 25.83 – 32.95 (mean 30.3135) |
| `category` | object | 0.0% | cereals and tubers, non-food, miscellaneous food |
| `commodity` | object | 0.0% | Oil (vegetable), Salt, Sugar |
| `commodity_id` | int64 | 0.0% | 50 – 887 (mean 307.9432) |
| `unit` | object | 0.0% | KG, L, 250 G |
| `priceflag` | object | 0.0% | aggregate, actual |
| `pricetype` | object | 0.0% | Retail |
| `currency` | object | 0.0% | ZWL, USD |
| `price` | float64 | 0.0% | 0.13 – 160000.0 (mean 1874.5285) |
| `usdprice` | float64 | 0.0% | 0.0 – 7.5 (mean 0.1827) |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | |
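
The type and null-percentage columns of a table like this can be recomputed from a loaded split. The sketch below shows one way to do it. `profile_schema` is a hypothetical helper name, and the small frame stands in for the real data.

```python
import pandas as pd

def profile_schema(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype and percentage of null values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_pct": (df.isna().mean() * 100).round(1),
    })

# Illustrative rows only; these are not real observations from the dataset.
sample = pd.DataFrame({
    "admin1": ["Masvingo", None, "Midlands", "Midlands"],
    "price": [1.0, 2.0, 3.0, 4.0],
})
profile = profile_schema(sample)
print(profile)
```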
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `market_id` | 708 | 8913 | 4178.2567 | 5403 |
| `latitude` | -22.2 | -15.97 | -18.9962 | -19.0 |
| `longitude` | 25.83 | 32.95 | 30.3135 | 30.51 |
| `commodity_id` | 50 | 887 | 307.9432 | 185 |
| `price` | 0.13 | 160000.0 | 1874.5285 | 110.0 |
| `usdprice` | 0.0 | 7.5 | 0.1827 | 0.0037 |
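
A summary with the same four statistics can be reproduced with pandas. This is a sketch under the assumption that the columns are already numeric. `numeric_summary` is a hypothetical helper, and the sample values are invented.

```python
import pandas as pd

def numeric_summary(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    """Min, max, mean, and median for each listed column, one row per column."""
    return df[cols].agg(["min", "max", "mean", "median"]).T

# Illustrative rows only; these are not real observations from the dataset.
sample = pd.DataFrame({
    "price": [1.0, 2.0, 9.0],
    "usdprice": [0.5, 1.0, 1.5],
})
summary = numeric_summary(sample, ["price", "usdprice"])
print(summary)
```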
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. One column was cast from string to a numeric or datetime type because more than 85% of its values parsed successfully. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
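
The curation steps described above can be approximated as follows. This is a sketch, not the pipeline Electric Sheep Africa actually ran: all function names are hypothetical, matching the markers case-insensitively is an assumption, only the numeric cast is shown (a datetime cast would follow the same pattern), and `DataFrame.sample` is an assumed stand-in for whichever splitter was really used. The download and Parquet steps are omitted.

```python
import re
import pandas as pd

# Markers listed in the card; matching them case-insensitively is an assumption.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}

def to_snake_case(name: str) -> str:
    """Lowercase a column name and join its words with underscores."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()

def unify_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Replace known missing-value markers with NaN."""
    def clean(value):
        if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
            return float("nan")
        return value
    return df.apply(lambda col: col.map(clean))

def cast_by_parse_rate(df: pd.DataFrame, threshold: float = 0.85) -> pd.DataFrame:
    """Cast a text column to numeric when more than `threshold` of its non-null values parse."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            continue
        non_null = out[col].notna().sum()
        parsed = pd.to_numeric(out[col], errors="coerce")
        if non_null and parsed.notna().sum() / non_null > threshold:
            out[col] = parsed
    return out

def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Hold out a random 20% of rows as the test partition."""
    test = df.sample(frac=0.2, random_state=seed)
    train = df.drop(test.index)
    return train, test

# Illustrative raw table; these are not real observations from the dataset.
raw = pd.DataFrame({
    "Market Name": ["Gweru Urban", "Chiredzi Urban", "N/A", "Gweru Urban", "Chiredzi Urban",
                    "Gweru Urban", "Chiredzi Urban", "Gweru Urban", "Chiredzi Urban", "Gweru Urban"],
    "Price": ["1.5", "2", "3", "4", "5", "6", "7", "8", "9", "-"],
})
clean = raw.rename(columns=to_snake_case)
clean = cast_by_parse_rate(unify_missing(clean))
train, test = split_80_20(clean)
print(clean.dtypes)
print(len(train), len(test))
```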
---
## Limitations
- Data originates from WFP - World Food Programme and has not been independently validated by Electric Sheep Africa (ESA).
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/wfp-food-prices-for-zimbabwe) for the publisher's own methodology notes and caveats.
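
Because automated cleaning does not catch misreported values, users may want to flag rows for manual review before modelling. The sketch below applies two hypothetical rules suggested by the schema: a `usdprice` of zero or less (the reported minimum is 0.0) and missing coordinates (about 4% of rows have null `latitude` and `longitude`). These rules are assumptions for illustration; a flagged row is not necessarily wrong.

```python
import pandas as pd

def flag_for_review(df: pd.DataFrame) -> pd.DataFrame:
    """Return one boolean column per hypothetical review rule."""
    flags = pd.DataFrame(index=df.index)
    flags["nonpositive_usdprice"] = df["usdprice"] <= 0
    flags["missing_coordinates"] = df["latitude"].isna() | df["longitude"].isna()
    return flags

# Illustrative rows only; these are not real observations from the dataset.
sample = pd.DataFrame({
    "usdprice": [0.0, 0.2, 0.3],
    "latitude": [-19.0, None, -18.0],
    "longitude": [30.0, 31.0, 30.5],
})
flags = flag_for_review(sample)
needs_review = flags.any(axis=1)  # True where at least one rule fires
print(sample[needs_review])
```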
---
## Citation
```bibtex
@dataset{hdx_africa_wfp_food_prices_for_zimbabwe,
title = {Zimbabwe - Food Prices},
author = {WFP - World Food Programme},
year = {2026},
url = {https://data.humdata.org/dataset/wfp-food-prices-for-zimbabwe},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*