
electricsheepafrica/africa-development-indicators-sierra-leone

Hugging Face · Updated 2026-04-27 · Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-development-indicators-sierra-leone
Description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- 1K<n<10K
source_datasets:
- original
task_categories:
- tabular-regression
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
pretty_name: "Sierra Leone Development Indicators"
dataset_info:
  splits:
  - name: train
    num_examples: 1078
  - name: test
    num_examples: 269
---

# Sierra Leone Development Indicators

**Publisher:** Code for Africa · **Source:** [OpenAfrica](https://open.africa/dataset/development-indicators-sierra-leone) · **License:** `cc-by` · **Updated:** 2023-11-30

---

## Abstract

World Development Indicators (WDI) is the primary World Bank collection of development indicators, compiled from officially recognized international sources. It presents the most current and accurate global development data available, and includes national, regional and global estimates. Each row in this dataset represents one tabular record. Data was last updated on OpenAfrica on 2023-11-30. Geographic scope: **Africa (multiple countries)**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Humanitarian and development data |
| **Unit of observation** | Tabular records |
| **Rows (total)** | 1,348 |
| **Columns** | 51 (45 numeric, 6 categorical, 0 datetime) |
| **Train split** | 1,078 rows |
| **Test split** | 269 rows |
| **Geographic scope** | Africa (multiple countries) |
| **Publisher** | Code for Africa |
| **OpenAfrica last updated** | 2023-11-30 |

---

## Variables

**Identifier / Metadata:** `data_source` (Sierra Leone, Last Updated Date, Country Name), `unnamed_2` (Indicator Name, Survival rate to the last grade of primary education, male (%), Teachers in primary education, both sexes (number)), `unnamed_3` (Indicator Code, SE.PRM.PRSL.MA.ZS, SE.PRM.TCHR), `unnamed_14` (range -185391939170.98–3784006115051.5093), `unnamed_15` (range -25997887500.9746–3649698056999.9995) and 45 others.

**Other:** `world_development_indicators` (SLE, 2016-02-17 00:00:00, Country Code).

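The sample values listed under Variables suggest that the source spreadsheet's header row ("Country Name", "Country Code", "Indicator Name", "Indicator Code") and a "Last Updated Date" metadata row were ingested as data, which would explain the `unnamed_*` column names. The following is a minimal plain-Python sketch of how the four identifier columns might be relabelled, assuming that reading is correct. The target names in `RENAMES` and the helper `tidy_identifiers` are hypothetical, and the two records are illustrative rows built from the sample values shown in the card.

```python
# Hypothetical mapping inferred from the sample values in the card;
# it is an assumption, not something the publisher documents.
RENAMES = {
    "data_source": "country_name",
    "world_development_indicators": "country_code",
    "unnamed_2": "indicator_name",
    "unnamed_3": "indicator_code",
}

# Header/metadata labels that appear as cell values in the card's samples.
HEADER_VALUES = {"Country Name", "Last Updated Date"}


def tidy_identifiers(rows):
    """Rename identifier columns and drop rows that look like header or metadata lines."""
    tidy = []
    for row in rows:
        if row.get("data_source") in HEADER_VALUES:
            continue  # leftover header/metadata row, not an indicator record
        tidy.append({RENAMES.get(key, key): value for key, value in row.items()})
    return tidy


# Illustrative records assembled from the sample values in the card.
rows = [
    {"data_source": "Country Name", "world_development_indicators": "Country Code",
     "unnamed_2": "Indicator Name", "unnamed_3": "Indicator Code"},
    {"data_source": "Sierra Leone", "world_development_indicators": "SLE",
     "unnamed_2": "Teachers in primary education, both sexes (number)",
     "unnamed_3": "SE.PRM.TCHR"},
]
records = tidy_identifiers(rows)
print(records[0]["indicator_code"])
```

With pandas, the renaming step could equally be written as `DataFrame.rename(columns=RENAMES)` on the frames returned by `to_pandas()`. The remaining `unnamed_14` to `unnamed_58` columns are left untouched here because the card does not say what they represent.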
---

## Quick Start

```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub
ds = load_dataset("electricsheepafrica/africa-development-indicators-sierra-leone")

# Convert each split to a pandas DataFrame
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
train.head()
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `data_source` | object | 0.1% | Sierra Leone, Last Updated Date, Country Name |
| `world_development_indicators` | object | 0.1% | SLE, 2016-02-17 00:00:00, Country Code |
| `unnamed_2` | object | 0.1% | Indicator Name, Survival rate to the last grade of primary education, male (%), Teachers in primary education, both sexes (number) |
| `unnamed_3` | object | 0.1% | Indicator Code, SE.PRM.PRSL.MA.ZS, SE.PRM.TCHR |
| `unnamed_14` | float64 | 69.0% | -185391939170.98 – 3784006115051.5093 (mean 61072311945.009) |
| `unnamed_15` | float64 | 66.9% | -25997887500.9746 – 3649698056999.9995 (mean 56307067750.8422) |
| `unnamed_16` | float64 | 65.7% | -139425517217.9297 – 3661773722899.9995 (mean 53523741795.3648) |
| `unnamed_17` | float64 | 66.0% | -105503985066.14 – 3919663954485.4414 (mean 57148847486.1194) |
| `unnamed_18` | float64 | 66.2% | -157369685931.63 – 4032823443700.0 (mean 57958123262.615) |
| `unnamed_19` | float64 | 65.4% | -255823167698.4884 – 3961298826700.0 (mean 54197999606.1128) |
| `unnamed_20` | float64 | 65.9% | -294714377485.744 – 3922445279999.9995 (mean 54572407439.802) |
| `unnamed_21` | float64 | 65.4% | -150145945950.5659 – 4044924200800.0 (mean 54573369900.7472) |
| `unnamed_22` | float64 | 64.4% | -452870745278.719 – 4471577976800.0 (mean 56040143933.6679) |
| `unnamed_23` | float64 | 67.3% | -724873823154.36 – 4924875611800.0 (mean 64695395132.3086) |
| `unnamed_24` | float64 | 63.1% | -532657932761.525 – 4952311057500.0 (mean 77592864640.251) |
| `unnamed_25` | float64 | 62.6% | -402232312019.717 – 4932543007100.0 (mean 76971069357.2273) |
| `unnamed_26` | float64 | 62.3% | -294287773056.485 – 5041226850600.0 (mean 78312538544.9175) |
| `unnamed_27` | float64 | 61.4% | -172473362288.215 – 4819593728100.0 (mean 72947264299.3427) |
| `unnamed_28` | float64 | 60.6% | -40894155644.3002 – 4864048317700.0 (mean 73793656845.1651) |
| `unnamed_29` | float64 | 61.6% | -73379399979.427 – 4653781002100.0 (mean 71939243919.7395) |
| `unnamed_30` | float64 | 62.8% | -78269338378.0405 – 4682995994300.0 (mean 74266109774.1459) |
| `unnamed_31` | float64 | 62.6% | -61935487099.069 – 5243210291534.68 (mean 80979671377.8831) |
| `unnamed_32` | float64 | 61.6% | -174097473745.135 – 5292197601612.6875 (mean 75379777904.0332) |
| `unnamed_33` | float64 | 59.6% | -199796812619.789 – 5089592479381.218 (mean 72716330066.9661) |
| `unnamed_34` | float64 | 52.5% | |
| `unnamed_35` | float64 | 53.0% | |
| `unnamed_36` | float64 | 54.0% | |
| `unnamed_37` | float64 | 54.1% | |
| `unnamed_38` | float64 | 54.0% | |
| `unnamed_39` | float64 | 52.4% | |
| `unnamed_40` | float64 | 52.9% | |
| `unnamed_41` | float64 | 52.6% | |
| `unnamed_42` | float64 | 52.9% | |
| `unnamed_43` | float64 | 52.4% | |
| `unnamed_44` | float64 | 45.0% | |
| `unnamed_45` | float64 | 46.3% | |
| `unnamed_46` | float64 | 46.6% | |
| `unnamed_47` | float64 | 45.6% | |
| `unnamed_48` | float64 | 43.5% | |
| `unnamed_49` | float64 | 37.5% | |
| `unnamed_50` | float64 | 40.6% | |
| `unnamed_51` | float64 | 37.0% | |
| `unnamed_52` | float64 | 35.0% | |
| `unnamed_53` | float64 | 37.4% | |
| `unnamed_54` | float64 | 33.2% | |
| `unnamed_55` | float64 | 39.9% | |
| `unnamed_56` | float64 | 38.9% | |
| `unnamed_57` | float64 | 42.1% | |
| `unnamed_58` | float64 | 64.6% | |
| `esa_source` | object | 0.0% | HDX |
| `esa_processed` | object | 0.0% | 2026-04-27 |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `unnamed_14` | -185391939170.98 | 3784006115051.5093 | 61072311945.009 | 157.98 |
| `unnamed_15` | -25997887500.9746 | 3649698056999.9995 | 56307067750.8422 | 67.7561 |
| `unnamed_16` | -139425517217.9297 | 3661773722899.9995 | 53523741795.3648 | 75.3781 |
| `unnamed_17` | -105503985066.14 | 3919663954485.4414 | 57148847486.1194 | 130.0 |
| `unnamed_18` | -157369685931.63 | 4032823443700.0 | 57958123262.615 | 227.6751 |
| `unnamed_19` | -255823167698.4884 | 3961298826700.0 | 54197999606.1128 | 430.4706 |
| `unnamed_20` | -294714377485.744 | 3922445279999.9995 | 54572407439.802 | 170.233 |
| `unnamed_21` | -150145945950.5659 | 4044924200800.0 | 54573369900.7472 | 1709.322 |
| `unnamed_22` | -452870745278.719 | 4471577976800.0 | 56040143933.6679 | 326.6179 |
| `unnamed_23` | -724873823154.36 | 4924875611800.0 | 64695395132.3086 | 40000.0 |
| `unnamed_24` | -532657932761.525 | 4952311057500.0 | 77592864640.251 | 50000.0 |
| `unnamed_25` | -402232312019.717 | 4932543007100.0 | 76971069357.2273 | 28645.0 |
| `unnamed_26` | -294287773056.485 | 5041226850600.0 | 78312538544.9175 | 11037.5 |
| `unnamed_27` | -172473362288.215 | 4819593728100.0 | 72947264299.3427 | 460.1674 |
| `unnamed_28` | -40894155644.3002 | 4864048317700.0 | 73793656845.1651 | 357.4236 |

---

## Curation

Raw data was downloaded from OpenAfrica via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. 11 columns with >80% missing values were removed: `unnamed_4`, `unnamed_5`, `unnamed_6`, `unnamed_7`, `unnamed_8`, `unnamed_9`... The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from Code for Africa and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- The following columns have >20% missing values and should be treated with caution in modelling: `unnamed_14`, `unnamed_15`, `unnamed_16`, `unnamed_17`, `unnamed_18`, `unnamed_19`, `unnamed_20`, `unnamed_21`...
- Refer to the [original dataset page on OpenAfrica](https://open.africa/dataset/development-indicators-sierra-leone) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{openafrica_africa_development_indicators_sierra_leone,
  title = {Sierra Leone Development Indicators},
  author = {Code for Africa},
  year = {2023},
  url = {https://open.africa/dataset/development-indicators-sierra-leone},
  note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
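
The curation steps described in the card (unifying missing-value markers, dropping columns with more than 80% missing values, and an 80/20 split with seed 42) could be approximated as follows. This is an illustrative plain-Python sketch, not ESA's actual pipeline: the helper names, the case-insensitive marker matching, and the use of `random.Random(42)` are all assumptions, so it should not be expected to reproduce the published train/test partitions.

```python
import random

# Markers listed in the Curation section; case-insensitive matching is an assumption.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def unify_missing(value):
    """Map common missing-value markers to None (standing in for NaN)."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
        return None
    return value


def drop_sparse_columns(rows, threshold=0.8):
    """Remove columns whose share of missing values exceeds `threshold`."""
    columns = list(rows[0].keys())
    keep = [
        col for col in columns
        if sum(row[col] is None for row in rows) / len(rows) <= threshold
    ]
    return [{col: row[col] for col in keep} for row in rows]


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle with a fixed seed and split into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Illustrative records only; the values are made up.
raw = [
    {"code": f"IND.{i}", "sparse": "N/A", "value": "-" if i % 5 == 0 else float(i)}
    for i in range(10)
]
clean = [{key: unify_missing(val) for key, val in row.items()} for row in raw]
clean = drop_sparse_columns(clean)
train, test = split_rows(clean)
print(sorted(clean[0]), len(train), len(test))
```

In this toy input the `sparse` column is entirely missing and is dropped, while `value` keeps its two missing entries because it stays under the threshold. Fixing the seed makes the split repeatable across runs of the same code, which is the property the card relies on.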
Provider:
electricsheepafrica