
electricsheepafrica/africa-wfp-food-prices-for-cameroon

Hugging Face · Updated 2026-04-09 · Indexed 2026-04-12
Download link: https://hf-mirror.com/datasets/electricsheepafrica/africa-wfp-food-prices-for-cameroon
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
source_datasets:
- original
task_categories:
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- economics
- food-security
- indicators
- markets
- cmr
pretty_name: "Cameroon - Food Prices"
dataset_info:
  splits:
  - name: train
    num_examples: 42400
  - name: test
    num_examples: 10600
---

# Cameroon - Food Prices

**Publisher:** WFP - World Food Programme · **Source:** [HDX](https://data.humdata.org/dataset/wfp-food-prices-for-cameroon) · **License:** `cc-by-igo` · **Updated:** 2026-04-05

---

## Abstract

This dataset contains food price data for Cameroon, sourced from the World Food Programme Price Database. The database covers foods such as maize, rice, beans, fish, and sugar for 98 countries and some 3000 markets. It is updated weekly, but most of its data is monthly. The data goes back as far as 1992 for a few countries, although many countries started reporting in 2003 or later.

Each row in this dataset is an observation for a subnational administrative unit. Temporal coverage is indicated by the `date` column. Geographic scope: **CMR**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Food security and nutrition |
| **Unit of observation** | Subnational administrative unit observations |
| **Rows (total)** | 53,000 |
| **Columns** | 18 (6 numeric, 11 categorical, 1 datetime) |
| **Train split** | 42,400 rows |
| **Test split** | 10,600 rows |
| **Geographic scope** | CMR |
| **Publisher** | WFP - World Food Programme |
| **HDX last updated** | 2026-04-05 |

---

## Variables

**Geographic:** `admin1` (Extrême-Nord, Sud-Ouest, Nord-Ouest), `admin2` (Fako, Logone-et-Chari, Lom-et-Djérem), `latitude` (range 3.86–12.39), `longitude` (range 9.17–15.45), `category` (cereals and tubers, pulses and nuts, vegetables and fruits) and 4 others.

**Temporal:** `date`.

**Outcome / Measurement:** `priceflag` (actual), `price` (range 3.0–160000.0), `usdprice` (range 0.005–262.3).

**Identifier / Metadata:** `market_id` (range 1578.0–8559.0), `esa_source` (HDX), `esa_processed`.

**Other:** `market` (Kousseri, Mokolo, Maroua), `unit` (KG, 90 KG, L).

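Because `unit` mixes measures such as `KG`, `90 KG`, and `L`, and `pricetype` separates retail from wholesale quotes, raw `price` values are probably not comparable across all rows. The sketch below shows one way to summarise prices only within groups that share a commodity, unit, and price type. Only the column names come from this card; the rows in `toy` are made-up placeholders, not values from the dataset.

```python
import pandas as pd

# Toy rows that mimic the documented columns; every value here is illustrative.
toy = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-15", "2024-02-15", "2024-01-15", "2024-02-15"]
        ),
        "commodity": ["Maize (white)"] * 4,
        "unit": ["KG", "KG", "90 KG", "90 KG"],
        "pricetype": ["Retail", "Retail", "Wholesale", "Wholesale"],
        "price": [300.0, 320.0, 20000.0, 22000.0],
    }
)

# Take the median price within each (commodity, unit, pricetype) group,
# so that per-kilogram retail quotes are never mixed with 90 kg wholesale bags.
summary = (
    toy.groupby(["commodity", "unit", "pricetype"], as_index=False)["price"]
    .median()
    .rename(columns={"price": "median_price"})
)
print(summary)
```

The same grouping can be applied to the real `train` frame once it is loaded; whether further unit conversion is appropriate depends on the publisher's definitions, which this card does not spell out.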
---

## Quick Start

```python
from datasets import load_dataset

ds = load_dataset("electricsheepafrica/africa-wfp-food-prices-for-cameroon")
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()
print(train.shape)
train.head()
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `date` | datetime64[ns] | 0.0% | |
| `admin1` | object | 0.0% | Extrême-Nord, Sud-Ouest, Nord-Ouest |
| `admin2` | object | 0.0% | Fako, Logone-et-Chari, Lom-et-Djérem |
| `market` | object | 0.0% | Kousseri, Mokolo, Maroua |
| `market_id` | int64 | 0.0% | 1578.0 – 8559.0 (mean 4049.6727) |
| `latitude` | float64 | 0.0% | 3.86 – 12.39 (mean 7.2202) |
| `longitude` | float64 | 0.0% | 9.17 – 15.45 (mean 12.5697) |
| `category` | object | 0.0% | cereals and tubers, pulses and nuts, vegetables and fruits |
| `commodity` | object | 0.0% | Maize (white), Groundnuts (shelled), Maize (yellow) |
| `commodity_id` | int64 | 0.0% | 52.0 – 1196.0 (mean 224.9709) |
| `unit` | object | 0.0% | KG, 90 KG, L |
| `priceflag` | object | 0.0% | actual |
| `pricetype` | object | 0.0% | Retail, Wholesale |
| `currency` | object | 0.0% | XAF |
| `price` | float64 | 0.0% | 3.0 – 160000.0 (mean 5780.5539) |
| `usdprice` | float64 | 0.0% | 0.005 – 262.3 (mean 10.1577) |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `market_id` | 1578.0 | 8559.0 | 4049.6727 | 2613.0 |
| `latitude` | 3.86 | 12.39 | 7.2202 | 6.5 |
| `longitude` | 9.17 | 15.45 | 12.5697 | 13.57 |
| `commodity_id` | 52.0 | 1196.0 | 224.9709 | 141.0 |
| `price` | 3.0 | 160000.0 | 5780.5539 | 900.0 |
| `usdprice` | 0.005 | 262.3 | 10.1577 | 1.51 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case.
Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. One column was cast from string to numeric or datetime based on its parse-success rate (>85% threshold). The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from WFP - World Food Programme and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/wfp-food-prices-for-cameroon) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_wfp_food_prices_for_cameroon,
  title  = {Cameroon - Food Prices},
  author = {WFP - World Food Programme},
  year   = {2026},
  url    = {https://data.humdata.org/dataset/wfp-food-prices-for-cameroon},
  note   = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
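The curation steps described in this card (snake_case column names, unified missing-value markers, threshold-based casting, and a seeded 80/20 split) can be approximated in pandas. The following is a minimal sketch, not ESA's actual pipeline, which the card does not publish. The helper names, the case-insensitive marker matching, the numeric-only casting, and the toy input are all assumptions made for illustration.

```python
import re

import numpy as np
import pandas as pd

# Markers the card lists as unified to NaN. Matching them case-insensitively
# is an assumption of this sketch.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name: str) -> str:
    """Lowercase a column name and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def unify_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Replace the listed missing-value markers in string cells with NaN."""

    def clean(value):
        if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
            return np.nan
        return value

    return df.apply(lambda col: col.map(clean))


def cast_if_parseable(col: pd.Series, threshold: float = 0.85) -> pd.Series:
    """Cast a non-numeric column to numeric when more than `threshold` of its
    non-null values parse. Datetime casting is omitted from this sketch."""
    if pd.api.types.is_numeric_dtype(col):
        return col
    non_null = col.dropna()
    if non_null.empty:
        return col
    parsed = pd.to_numeric(non_null, errors="coerce")
    if parsed.notna().mean() > threshold:
        return pd.to_numeric(col, errors="coerce")
    return col


def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Shuffle with a fixed seed, then split into 80% train and 20% test."""
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    cut = int(len(shuffled) * 0.8)
    return shuffled.iloc[:cut], shuffled.iloc[cut:]


# Hypothetical raw input with untidy column names and one missing marker.
raw = pd.DataFrame(
    {
        "Market Name": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
        "Price": ["100", "250", "N/A", "90", "75", "300", "120", "80", "60", "45"],
    }
)

clean = unify_missing(raw.rename(columns=to_snake_case))
clean = clean.apply(cast_if_parseable)
train, test = split_80_20(clean)
print(list(clean.columns), len(train), len(test))
```

A plain seeded shuffle is used here because the card mentions only a fixed random seed; it does not say whether the split was stratified by market, commodity, or date, so any such choice would be a further assumption.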
