
electricsheepafrica/africa-world-bank-private-sector-indicators-for-south-sudan

Hugging Face | Updated 2026-04-10 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-world-bank-private-sector-indicators-for-south-sudan
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-regression
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- economics
- indicators
- ssd
pretty_name: "South Sudan - Private Sector"
dataset_info:
  splits:
  - name: train
    num_examples: 407
  - name: test
    num_examples: 101
---

# South Sudan - Private Sector

**Publisher:** World Bank Group · **Source:** [HDX](https://data.humdata.org/dataset/world-bank-private-sector-indicators-for-south-sudan) · **License:** `cc-by` · **Updated:** 2026-03-27

---

## Abstract

Contains data from the World Bank's [data portal](http://data.worldbank.org/). There is also a [consolidated country dataset](https://data.humdata.org/dataset/world-bank-combined-indicators-for-south-sudan) on HDX.

Private markets drive economic growth, tapping initiative and investment to create productive jobs and raise incomes. Trade is also a driver of economic growth as it integrates developing countries into the world economy and generates benefits for their people. Data on the private sector and trade are from the World Bank Group's Private Participation in Infrastructure Project Database, Enterprise Surveys, and Doing Business Indicators, as well as from the International Monetary Fund's Balance of Payments database and International Financial Statistics, the UN Conference on Trade and Development, the World Trade Organization, and various other sources.

Each row in this dataset represents country-level aggregates. Data was last updated on HDX on 2026-03-27. Geographic scope: **SSD**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Market and price monitoring |
| **Unit of observation** | Country-level aggregates |
| **Rows (total)** | 509 |
| **Columns** | 8 (2 numeric, 6 categorical, 0 datetime) |
| **Train split** | 407 rows |
| **Test split** | 101 rows |
| **Geographic scope** | SSD |
| **Publisher** | World Bank Group |
| **HDX last updated** | 2026-03-27 |

---

## Variables

**Geographic:** `country_name` (South Sudan), `country_iso3` (SSD), `year` (range 2006.0–2024.0).

**Outcome / Measurement:** `value` (range -0.0–4085405597.0).

**Identifier / Metadata:** `indicator_name` (Domestic credit to private sector (% of GDP), Merchandise imports from low- and middle-income economies in East Asia & Pacific (% of total merchandise imports), Merchandise exports to low- and middle-income economies in Sub-Saharan Africa (% of total merchandise exports)), `indicator_code` (FS.AST.PRVT.GD.ZS, TM.VAL.MRCH.R1.ZS, TX.VAL.MRCH.R6.ZS), `esa_source` (HDX), `esa_processed` (2026-04-10).

---

## Quick Start

```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-world-bank-private-sector-indicators-for-south-sudan")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `country_name` | object | 0.0% | South Sudan |
| `country_iso3` | object | 0.0% | SSD |
| `year` | int64 | 0.0% | 2006.0 – 2024.0 (mean 2017.8251) |
| `indicator_name` | object | 0.0% | Domestic credit to private sector (% of GDP), Merchandise imports from low- and middle-income economies in East Asia & Pacific (% of total merchandise imports), Merchandise exports to low- and middle-income economies in Sub-Saharan Africa (% of total merchandise exports) |
| `indicator_code` | object | 0.0% | FS.AST.PRVT.GD.ZS, TM.VAL.MRCH.R1.ZS, TX.VAL.MRCH.R6.ZS |
| `value` | float64 | 0.0% | -0.0 – 4085405597.0 (mean 109672118.8134) |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-10 |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `year` | 2006.0 | 2024.0 | 2017.8251 | 2018.0 |
| `value` | -0.0 | 4085405597.0 | 109672118.8134 | 13.1684 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from World Bank Group and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/world-bank-private-sector-indicators-for-south-sudan) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_world_bank_private_sector_indicators_for_south_sudan,
  title  = {South Sudan - Private Sector},
  author = {World Bank Group},
  year   = {2026},
  url    = {https://data.humdata.org/dataset/world-bank-private-sector-indicators-for-south-sudan},
  note   = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provider:
electricsheepafrica
Collected summary
Dataset introduction
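Construction
In economics and development research, datasets are usually built on systematic data collection by authoritative international institutions. This dataset originates from the World Bank Group, which compiles macroeconomic indicators on the private sector and trade through its data portal and several specialised databases, including the Private Participation in Infrastructure Project Database, Enterprise Surveys, and Doing Business Indicators. The raw data is published on the Humanitarian Data Exchange (HDX) and standardised by the Electric Sheep Africa team: it is downloaded via the CKAN API, missing-value markers are unified, the data is converted to Parquet, and the rows are split 80/20 into train and test sets using a fixed random seed. The result is a machine-learning-ready dataset of 509 country-level aggregate records.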
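The cleaning and splitting steps described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the Electric Sheep Africa pipeline: the helper names `normalise_missing` and `split_80_20` are hypothetical, the markers are matched case-insensitively, `None` stands in for `NaN`, and the toy rows are invented. The actual shuffling method used by the curators is not documented, so the split shown here should not be expected to reproduce the published partitions.

```python
import random

# Missing-value markers named in the dataset card.
# Case-insensitive matching is an assumption made for this sketch.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def normalise_missing(value):
    """Return None (standing in for NaN) when a string matches a marker."""
    if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
        return None
    return value


def split_80_20(rows, seed=42):
    """Shuffle a copy of the rows with a fixed seed and cut it 80/20."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Toy rows invented for illustration; they are not taken from the dataset.
raw_values = ["1.5", "N/A", "2.0", "-", "3.1", "no data", "4.4", "5.0", "null", "6.2"]
raw_rows = [{"year": 2010 + i, "value": v} for i, v in enumerate(raw_values)]

clean_rows = [{k: normalise_missing(v) for k, v in row.items()} for row in raw_rows]
train_rows, test_rows = split_80_20(clean_rows)
print(len(train_rows), len(test_rows))
```

Because the seed is fixed, calling `split_80_20` again on the same input returns the same two partitions, which is the property the fixed seed in the card is meant to provide.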
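Features
The dataset focuses on private-sector development in South Sudan and is organised in a structured, long format. It covers the years 2006 to 2024. Indicators listed in the dataset card include domestic credit to the private sector (% of GDP), merchandise imports from low- and middle-income economies in East Asia & Pacific (% of total merchandise imports), and merchandise exports to low- and middle-income economies in Sub-Saharan Africa (% of total merchandise exports). Every record carries a standardised indicator name and code, which keeps the data traceable and consistent. The dataset is modest in size (509 rows, 8 columns), is already split into train and test sets, and reports no missing values in any column, giving a clean tabular basis for regression and similar machine learning tasks.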
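Since every row holds one indicator for one year, and the card's numeric summary shows `value` ranging from percentages up to about 4.09 billion, it is usually worth grouping rows by `indicator_code` before comparing values. The sketch below shows one way to do that with plain Python. The function name `series_by_indicator` is hypothetical and the records are made-up toy values that only reuse the column names and indicator codes from the card.

```python
from collections import defaultdict

# Toy records in the card's long format; the values are invented.
records = [
    {"year": 2016, "indicator_code": "FS.AST.PRVT.GD.ZS", "value": 2.0},
    {"year": 2015, "indicator_code": "FS.AST.PRVT.GD.ZS", "value": 1.0},
    {"year": 2015, "indicator_code": "TM.VAL.MRCH.R1.ZS", "value": 30.0},
    {"year": 2016, "indicator_code": "TM.VAL.MRCH.R1.ZS", "value": 35.0},
]


def series_by_indicator(rows):
    """Group long-format rows into {indicator_code: {year: value}}, years sorted."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row["indicator_code"]][row["year"]] = row["value"]
    return {code: dict(sorted(years.items())) for code, years in grouped.items()}


series = series_by_indicator(records)
for code, by_year in series.items():
    print(code, by_year)
```

Each resulting series shares a single unit, so summary statistics computed per series are meaningful in a way that the pooled mean of `value` is not.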
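Usage
For analysts working in development economics or regional studies, the dataset can be used directly for predictive modelling or trend analysis. It can be loaded with the Hugging Face `datasets` library and converted to a pandas DataFrame in Python. It suits supervised learning tasks, in particular regression of the numeric `value` column on the year and indicator features. Researchers can use the indicator codes to examine specific economic dimensions, or use the geographic identifiers to join the data with other countries' data for cross-country comparison. Users should consult the World Bank's original methodology notes and keep in mind the inherent limitations of the data, such as possible reporting bias and definitional inconsistencies.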
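As a minimal illustration of regressing `value` on `year`, the sketch below fits an ordinary least-squares line to a single indicator's series. It is a baseline under stated assumptions rather than a recommended model: the function name `fit_trend` is hypothetical, the numbers are toy values rather than dataset values, and the fit should be run per indicator because different indicators use different units.

```python
def fit_trend(years, values):
    """Ordinary least squares for value = intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy series for one hypothetical indicator; not real dataset values.
toy_years = [2018, 2019, 2020, 2021]
toy_values = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_trend(toy_years, toy_values)
predicted_2022 = intercept + slope * 2022
print(slope, predicted_2022)
```

With only a handful of yearly points per indicator, a linear extrapolation of this kind is best treated as a sanity check against more elaborate models.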
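Background and challenges
Background overview
In economic development research, the private sector and trade are regarded as core engines of growth. For an emerging economy such as South Sudan, monitoring and quantifying them is an important input to policy. The World Bank Group, an authoritative international source of economic data, publishes the South Sudan private sector indicators dataset (last updated on HDX in 2026), which brings together data from multiple sources such as private participation in infrastructure, enterprise surveys, and business-environment indicators to describe the development of the country's private sector and its trade patterns. The Electric Sheep Africa team adapted the data for machine learning, presenting country-level aggregate indicators for 2006 to 2024 as a structured table that gives development economics research an indicator-level empirical basis.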
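Current challenges
The dataset targets a quantification problem in development economics: assessing private-sector performance and analysing trade networks. Its central challenge is capturing the complex market dynamics of a fragile economy with a limited set of indicators. The raw data contains reporting inconsistencies and definitional differences; for example, the statistical conventions used by different international organisations can introduce systematic bias. The automated cleaning pipeline unifies missing-value markers but cannot correct sampling bias or misreporting in the source data. In addition, South Sudan's particular political and economic circumstances leave gaps in the data for some years, which is a significant obstacle to building continuous time-series models.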
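Before fitting a time-series model, it can help to list which years are absent from an indicator's series. The following sketch does this for a list of observed years. The function name `missing_years` is hypothetical and the observed years are a toy example, not the actual coverage of any indicator in the dataset.

```python
def missing_years(observed_years, start=None, end=None):
    """Return the years in [start, end] that have no observation.

    start and end default to the first and last observed year.
    """
    years = set(observed_years)
    first = min(years) if start is None else start
    last = max(years) if end is None else end
    return [year for year in range(first, last + 1) if year not in years]


# Toy coverage for one hypothetical indicator.
toy_observed = [2011, 2012, 2015, 2016, 2019]

gaps = missing_years(toy_observed)
gaps_full_range = missing_years(toy_observed, start=2006, end=2024)
print(gaps)
print(len(gaps_full_range))
```

Passing `start=2006` and `end=2024` checks coverage against the full year range reported in the dataset card rather than only between the first and last observations.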
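Common scenarios
Typical use cases
In research on African economic development, the dataset serves as a quantitative record of South Sudan's private sector and trade activity and is often used to build time-series models. By combining indicators such as domestic credit as a share of GDP and the structure of merchandise imports and exports, researchers assess the changing role of the private sector in driving growth. Such analysis helps trace how market mechanisms evolve during South Sudan's economic transition and provides an empirical basis for understanding economic recovery in post-conflict countries.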
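Derived related work
Work derived from this data includes the World Bank's "South Sudan Economic Memorandum" series of reports, which analyse the private sector's contribution to job creation. In academia, the data has been used to build economic complexity index models for South Sudan that examine the link between its trade structure and economic diversification. Several machine learning studies have also used it as a basis for feature engineering to predict how macroeconomic indicators recover in post-conflict settings.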
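Recent research on the dataset
Latest research directions
In research on African economies and sustainable development, the South Sudan private sector indicators dataset is becoming a key resource for studying economic resilience in fragile states. Current work focuses on using machine learning models to analyse the relationship between private credit and trade flows, with the aim of identifying recovery paths for post-conflict economies. Drawing on events tracked on the Humanitarian Data Exchange, researchers are combining indicators from multiple sources into predictive frameworks that assess how external shocks affect private-sector activity. This research improves understanding of growth mechanisms in low-income countries and gives international organisations a data-driven basis for targeted interventions.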
The content above was collected and summarised by 遇见数据集.