
electricsheepafrica/africa-sind-protection-in-danger-monthly-news-briefs-dataset

Hugging Face · Updated 2026-04-20 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-sind-protection-in-danger-monthly-news-briefs-dataset
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-sa-4.0
multilinguality:
- monolingual
size_categories:
- 1K<n<10K
source_datasets:
- original
task_categories:
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- aid-worker-security
- crisis-opt-israel-hostilities
- internally-displaced-persons-idp
- populated-places-settlements
- refugee-crisis
- refugees
- afg
- bgd
- blr
- bih
- bfa
pretty_name: "Protection in Danger Data"
dataset_info:
  splits:
  - name: train
    num_examples: 6718
  - name: test
    num_examples: 1679
---

# Protection in Danger Data

**Publisher:** Insecurity Insight · **Source:** [HDX](https://data.humdata.org/dataset/sind-protection-in-danger-monthly-news-briefs-dataset) · **License:** `cc-by-sa` · **Updated:** 2026-04-13

---

## Abstract

This page contains violent events affecting refugee and IDP camps, drawn from agency and open-source reports and published in the [Protection in Danger Monthly News Brief](https://insecurityinsight.org/projects/ensuring-protection/protection-in-danger-monthly-news-brief-2). Events are categorized by country. For curated datasets, contact Insecurity Insight at info@insecurityinsight.org.

Each row in this dataset represents country-level aggregates. Temporal coverage is indicated by the `date` and `date_event_entered` columns. Geographic scope: **AFG, BGD, BLR, BIH, BFA, CMR, CAF, TCD, and 42 others**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Humanitarian and development data |
| **Unit of observation** | Country-level aggregates |
| **Rows (total)** | 8,398 |
| **Columns** | 29 (12 numeric, 14 categorical, 3 datetime) |
| **Train split** | 6,718 rows |
| **Test split** | 1,679 rows |
| **Geographic scope** | AFG, BGD, BLR, BIH, BFA, CMR, CAF, TCD, and 42 others |
| **Publisher** | Insecurity Insight |
| **HDX last updated** | 2026-04-13 |

---

## Variables

- **Geographic:** `country` (OPT, Sudan, Syria), `country_iso` (PSE, SDN, SYR), `admin_1` (Gaza Strip, West Bank, North Darfur), `protection_event_context` (Airstrike/Shelling, Security Operation, Targetted Attack on Camp), `survivor_or_victim_sex`.
- **Temporal:** `date`, `date_event_entered`, `date_event_modified`.
- **Demographic:** `number_of_attacks_on_camps_reporting_damaged` (range 0.0–110.0).
- **Outcome / Measurement:** `number_of_attacks_on_camps_reporting_destruction` (range 0.0–24.0).
- **Identifier / Metadata:** `camp_name` (Temporary/Makeshift Site, Nuseirat Camp, Jabalia Camp), `reported_perpetrator_name` (Israeli Defence Forces, Rapid Support Forces, Armed men), `camp_resident_killed` (range 0.0–274.0), `camp_resident_injured` (range 0.0–647.0), `camp_residents_kidnapped` (range 0.0–137.0) and 9 others.
- **Other:** `geo_precision` (censored), `reported_perpetrator` (Host Government: Military, NSA, Multiple), `weapon_carried_used` (Firearms, Aerial Bomb: Plane, Aerial Bomb: Drone), `victim_of_violence` (Camp Resident, No Direct Victim Reported, Health Worker), `survivor_or_victim_minor`.
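
As a rough illustration of how the count columns can be combined with `date` and `country_iso`, the sketch below sums casualties per country and calendar month. The rows in `toy` are invented purely for demonstration and the helper name `monthly_country_totals` is hypothetical; only the column names come from this card.

```python
import pandas as pd

# Toy rows invented for illustration only; column names follow the card.
toy = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-05", "2024-01-20", "2024-02-03", "2024-01-11"]
        ),
        "country_iso": ["PSE", "PSE", "PSE", "SDN"],
        "camp_resident_killed": [2, 0, 5, 1],
        "camp_resident_injured": [4, 1, 0, 3],
    }
)


def monthly_country_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Sum casualty counts per country and calendar month (hypothetical helper)."""
    count_cols = ["camp_resident_killed", "camp_resident_injured"]
    return (
        df.assign(month=df["date"].dt.to_period("M"))  # e.g. 2024-01
        .groupby(["country_iso", "month"], as_index=False)[count_cols]
        .sum()
    )


totals = monthly_country_totals(toy)
print(totals)
```

The same function should work on the real `train` DataFrame once it has been loaded, provided `date` is parsed as a datetime column as the schema indicates.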

---

## Quick Start

```python
from datasets import load_dataset

# Download both splits from the Hugging Face Hub
ds = load_dataset("electricsheepafrica/africa-sind-protection-in-danger-monthly-news-briefs-dataset")

# Convert each split to a pandas DataFrame
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
train.head()
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `date` | datetime64[ns] | 0.0% | |
| `country` | object | 0.0% | OPT, Sudan, Syria |
| `country_iso` | object | 0.0% | PSE, SDN, SYR |
| `admin_1` | object | 0.0% | Gaza Strip, West Bank, North Darfur |
| `geo_precision` | object | 0.0% | censored |
| `camp_name` | object | 4.5% | Temporary/Makeshift Site, Nuseirat Camp, Jabalia Camp |
| `reported_perpetrator` | object | 0.0% | Host Government: Military, NSA, Multiple |
| `reported_perpetrator_name` | object | 0.0% | Israeli Defence Forces, Rapid Support Forces, Armed men |
| `weapon_carried_used` | object | 0.0% | Firearms, Aerial Bomb: Plane, Aerial Bomb: Drone |
| `protection_event_context` | object | 0.7% | Airstrike/Shelling, Security Operation, Targetted Attack on Camp |
| `victim_of_violence` | object | 0.1% | Camp Resident, No Direct Victim Reported, Health Worker |
| `survivor_or_victim_sex` | object | 0.2% | |
| `survivor_or_victim_minor` | object | 0.4% | |
| `number_of_attacks_on_camps_reporting_destruction` | int64 | 0.0% | 0.0 – 24.0 (mean 0.01) |
| `number_of_attacks_on_camps_reporting_damaged` | int64 | 0.0% | 0.0 – 110.0 (mean 0.5925) |
| `camp_resident_killed` | int64 | 0.0% | 0.0 – 274.0 (mean 1.9402) |
| `camp_resident_injured` | int64 | 0.0% | 0.0 – 647.0 (mean 1.5569) |
| `camp_residents_kidnapped` | int64 | 0.0% | 0.0 – 137.0 (mean 0.0562) |
| `camp_residents_arrested` | int64 | 0.0% | 0.0 – 538.0 (mean 0.6081) |
| `camp_residents_targeted_with_crsv` | int64 | 0.0% | 0.0 – 32.0 (mean 0.021) |
| `service_provider_killed` | int64 | 0.0% | 0.0 – 11.0 (mean 0.0246) |
| `service_provider_kidnapped` | int64 | 0.0% | 0.0 – 5.0 (mean 0.0024) |
| `service_provider_arrested` | int64 | 0.0% | 0.0 – 70.0 (mean 0.0196) |
| `service_provider_targeted_with_crsv` | int64 | 0.0% | 0.0 – 2.0 (mean 0.0005) |
| `sind_id` | int64 | 0.0% | 19018.0 – 126955.0 (mean 87731.1185) |
| `date_event_entered` | datetime64[ns] | 0.0% | |
| `date_event_modified` | datetime64[ns] | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `number_of_attacks_on_camps_reporting_destruction` | 0.0 | 24.0 | 0.01 | 0.0 |
| `number_of_attacks_on_camps_reporting_damaged` | 0.0 | 110.0 | 0.5925 | 1.0 |
| `camp_resident_killed` | 0.0 | 274.0 | 1.9402 | 0.0 |
| `camp_resident_injured` | 0.0 | 647.0 | 1.5569 | 0.0 |
| `camp_residents_kidnapped` | 0.0 | 137.0 | 0.0562 | 0.0 |
| `camp_residents_arrested` | 0.0 | 538.0 | 0.6081 | 0.0 |
| `camp_residents_targeted_with_crsv` | 0.0 | 32.0 | 0.021 | 0.0 |
| `service_provider_killed` | 0.0 | 11.0 | 0.0246 | 0.0 |
| `service_provider_kidnapped` | 0.0 | 5.0 | 0.0024 | 0.0 |
| `service_provider_arrested` | 0.0 | 70.0 | 0.0196 | 0.0 |
| `service_provider_targeted_with_crsv` | 0.0 | 2.0 | 0.0005 | 0.0 |
| `sind_id` | 19018.0 | 126955.0 | 87731.1185 | 92661.5 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. Three columns with more than 80% missing values were removed: `event_description`, `latitude`, `longitude`. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from Insecurity Insight and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- This dataset spans 50 countries; geographic and methodological inconsistencies across national boundaries may affect cross-country comparability.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/sind-protection-in-danger-monthly-news-briefs-dataset) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_sind_protection_in_danger_monthly_news_briefs_dataset,
  title  = {Protection in Danger Data},
  author = {Insecurity Insight},
  year   = {2026},
  url    = {https://data.humdata.org/dataset/sind-protection-in-danger-monthly-news-briefs-dataset},
  note   = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
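
The cleaning steps described under Curation (snake_case column names, unified missing-value markers, removal of columns with more than 80% missing values, and a seeded 80/20 split) could be sketched roughly as follows. This is not the publisher's actual pipeline: the function names, the case-insensitive marker matching, the use of `DataFrame.sample` for the split, and the toy input rows are all assumptions made for illustration, so it should not be expected to reproduce the published partitions.

```python
import re

import numpy as np
import pandas as pd

# Markers listed in the Curation section; case-insensitive matching is an assumption.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name: str) -> str:
    """Lowercase a column name and join its alphanumeric parts with underscores."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()


def curate(raw: pd.DataFrame, missing_threshold=0.8, test_fraction=0.2, seed=42):
    """Hypothetical re-creation of the described cleaning and splitting steps."""
    df = raw.reset_index(drop=True).copy()
    df.columns = [to_snake_case(c) for c in df.columns]

    # Unify missing-value markers to NaN in non-numeric, non-datetime columns.
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            continue
        if pd.api.types.is_datetime64_any_dtype(df[col]):
            continue
        as_text = df[col].astype(str).str.strip().str.lower()
        df[col] = df[col].mask(as_text.isin(MISSING_MARKERS), np.nan)

    # Drop columns whose share of missing values exceeds the threshold.
    missing_share = df.isna().mean()
    df = df.loc[:, missing_share <= missing_threshold]

    # Seeded random split (assumed mechanism; the card only states 80/20, seed 42).
    test = df.sample(frac=test_fraction, random_state=seed)
    train = df.drop(test.index)
    return train, test


# Toy input invented for illustration only.
raw = pd.DataFrame(
    {
        "Country": ["Sudan", "N/A", "Syria", "OPT", "Sudan",
                    "unknown", "Syria", "OPT", "Sudan", "Syria"],
        "Camp Resident Killed": [0, 1, 2, 0, 3, 0, 1, 0, 4, 0],
        "Event Description": ["text", None, "-", "null", None,
                              None, "no data", None, None, None],
    }
)
train, test = curate(raw)
print(train.columns.tolist(), len(train), len(test))

# Writing the splits would need pyarrow or fastparquet installed:
# train.to_parquet("train.parquet", compression="snappy")
```

In this toy input, `Event Description` is 90% missing after the markers are unified, so it is dropped in the same way the card reports for `event_description`, `latitude`, and `longitude`.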
Provider:
electricsheepafrica