
electricsheepafrica/africa-hdro-data-for-mozambique

Hugging Face · Updated 2026-04-06 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-hdro-data-for-mozambique
Description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- tabular-regression
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- demographics
- development
- education
- gender
- health
- indicators
- socioeconomics
- moz
pretty_name: "Mozambique - Human Development Indicators"
dataset_info:
  splits:
  - name: train
    num_examples: 750
  - name: test
    num_examples: 187
---

# Mozambique - Human Development Indicators

**Publisher:** UNDP Human Development Reports Office (HDRO) · **Source:** [HDX](https://data.humdata.org/dataset/hdro-data-for-mozambique) · **License:** `cc-by-igo` · **Updated:** 2026-03-04

---

## Abstract

The aim of the Human Development Report is to stimulate global, regional and national policy-relevant discussions on issues pertinent to human development. Accordingly, the data in the Report require the highest standards of data quality, consistency, international comparability and transparency. The Human Development Report Office (HDRO) fully subscribes to the Principles governing international statistical activities.

The HDI was created to emphasize that people and their capabilities should be the ultimate criteria for assessing the development of a country, not economic growth alone. The HDI can also be used to question national policy choices, asking how two countries with the same level of GNI per capita can end up with different human development outcomes. These contrasts can stimulate debate about government policy priorities.

The Human Development Index (HDI) is a summary measure of average achievement in key dimensions of human development: a long and healthy life, being knowledgeable, and having a decent standard of living. The HDI is the geometric mean of normalized indices for each of the three dimensions.
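As a minimal sketch of the aggregation step just described, the snippet below takes three dimension indices that are assumed to be already normalized to the range 0 to 1 and returns their geometric mean. The function name `hdi_from_dimension_indices` and the example inputs are hypothetical illustrations, not values from this dataset, and the normalization of the raw indicators is not shown.

```python
def hdi_from_dimension_indices(health: float, education: float, income: float) -> float:
    """Return the geometric mean of three dimension indices.

    Each index is assumed to be normalized to the closed interval [0, 1]
    before it is passed in (hypothetical helper for illustration only).
    """
    for name, value in (("health", health), ("education", education), ("income", income)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} index must lie in [0, 1], got {value}")
    # Geometric mean of three values: cube root of their product
    return (health * education * income) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Illustrative inputs only, not real figures for any country
    print(round(hdi_from_dimension_indices(0.6, 0.4, 0.5), 3))
```

Because the mean is geometric rather than arithmetic, a low score in one dimension pulls the result down more than an equal-sized high score in another dimension lifts it.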
The 2019 Global Multidimensional Poverty Index (MPI) data shed light on the number of people experiencing poverty at regional, national and subnational levels, and reveal inequalities across countries and among the poor themselves. Jointly developed by the United Nations Development Programme (UNDP) and the Oxford Poverty and Human Development Initiative (OPHI) at the University of Oxford, the 2019 global MPI offers data for 101 countries, covering 76 percent of the global population.

The MPI provides a comprehensive and in-depth picture of global poverty – in all its dimensions – and monitors progress towards Sustainable Development Goal (SDG) 1 – to end poverty in all its forms. It also provides policymakers with the data to respond to the call of Target 1.2, which is to "reduce at least by half the proportion of men, women, and children of all ages living in poverty in all its dimensions according to national definition".

Each row in this dataset represents a country-level aggregate. Data was last updated on HDX on 2026-03-04. Geographic scope: **MOZ**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| Property | Value |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Country-level aggregates |
| **Rows (total)** | 938 |
| **Columns** | 10 (2 numeric, 8 categorical, 0 datetime) |
| **Train split** | 750 rows |
| **Test split** | 187 rows |
| **Geographic scope** | MOZ |
| **Publisher** | UNDP Human Development Reports Office (HDRO) |
| **HDX last updated** | 2026-03-04 |

---

## Variables

**Geographic:** `country_code` (MOZ), `country_name` (Mozambique), `index_id` (GDI, GII, HDI), `index_name` (Gender Development Index, Gender Inequality Index, Human Development Index), `year` (range 1990.0–2023.0).

**Outcome / Measurement:** `value` (range 0.066–1674.917).
**Identifier / Metadata:** `indicator_id` (eys, pop_total, mys_f), `indicator_name` (Expected Years of Schooling (years), Population, total (millions), Mean Years of Schooling, female (years)), `esa_source` (HDX), `esa_processed` (2026-04-06).

---

## Quick Start

```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub
ds = load_dataset("electricsheepafrica/africa-hdro-data-for-mozambique")

# Convert each split to a pandas DataFrame
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
train.head()
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `country_code` | object | 0.0% | MOZ |
| `country_name` | object | 0.0% | Mozambique |
| `indicator_id` | object | 0.0% | eys, pop_total, mys_f |
| `indicator_name` | object | 0.0% | Expected Years of Schooling (years), Population, total (millions), Mean Years of Schooling, female (years) |
| `index_id` | object | 0.0% | GDI, GII, HDI |
| `index_name` | object | 0.0% | Gender Development Index, Gender Inequality Index, Human Development Index |
| `value` | float64 | 0.0% | 0.066 – 1674.917 (mean 147.5328) |
| `year` | int64 | 0.0% | 1990.0 – 2023.0 (mean 2007.9979) |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-06 |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `value` | 0.066 | 1674.917 | 147.5328 | 25.126 |
| `year` | 1990.0 | 2023.0 | 2007.9979 | 2009.0 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from UNDP Human Development Reports Office (HDRO) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/hdro-data-for-mozambique) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_hdro_data_for_mozambique,
  title  = {Mozambique - Human Development Indicators},
  author = {UNDP Human Development Reports Office (HDRO)},
  year   = {2026},
  url    = {https://data.humdata.org/dataset/hdro-data-for-mozambique},
  note   = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
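The curation steps described in the Curation section (snake_case column names, unified missing-value markers, and a seeded 80/20 split) could be approximated as in the sketch below. This is not the publisher's actual pipeline: the helper names `to_snake_case`, `clean_frame`, and `split_frame` are hypothetical, case-insensitive marker matching is an assumption, and the CKAN download step is omitted.

```python
import re

import numpy as np
import pandas as pd

# Missing-value markers listed in the card, lowercased for comparison
# (case-insensitive matching is an assumption of this sketch)
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name: str) -> str:
    """Lowercase a column name and collapse non-alphanumeric runs to '_'."""
    return re.sub(r"[^0-9A-Za-z]+", "_", name.strip()).strip("_").lower()


def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Rename columns to snake_case and replace missing-value markers with NaN."""
    out = df.rename(columns=to_snake_case)
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            continue  # numeric columns cannot hold text markers
        is_marker = out[col].astype(str).str.strip().str.lower().isin(MISSING_MARKERS)
        out[col] = out[col].mask(is_marker, np.nan)
    return out


def split_frame(df: pd.DataFrame, train_frac: float = 0.8, seed: int = 42):
    """Split rows into train and test partitions with a fixed random seed."""
    df = df.reset_index(drop=True)  # unique index so drop() removes exactly the train rows
    train = df.sample(frac=train_frac, random_state=seed)
    test = df.drop(train.index)
    return train, test


if __name__ == "__main__":
    # Tiny made-up frame, only to exercise the helpers
    raw = pd.DataFrame({"Country Code": ["MOZ"] * 5, "Value": ["1.5", "N/A", "2", "-", "3"]})
    train, test = split_frame(clean_frame(raw))
    print(train.shape, test.shape)
```

Each partition could then be written with pandas' `DataFrame.to_parquet(path, compression="snappy")`, which requires a Parquet engine such as `pyarrow` to be installed.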

Provider:
electricsheepafrica