electricsheepafrica/africa-ethiopia-attacks-on-aid-operations-education-health-and-protection
Hugging Face · Updated 2026-04-10 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-ethiopia-attacks-on-aid-operations-education-health-and-protection
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-sa-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- aid-worker-security
- aid-workers
- complex-emergency-conflict-security
- conflict-violence
- damage-assessment
- disease
- education
- education-facilities-schools
- eth
pretty_name: "Ethiopia (ETH): Attacks on Aid Operations, Education, Health Care and IDP/Refugee Camps, and Conflict-Related Sexual Violence and Explosive Weapons Incident Data"
dataset_info:
  splits:
  - name: train
    num_examples: 93
  - name: test
    num_examples: 23
---
# Ethiopia (ETH): Attacks on Aid Operations, Education, Health Care and IDP/Refugee Camps, and Conflict-Related Sexual Violence and Explosive Weapons Incident Data
**Publisher:** Insecurity Insight · **Source:** [HDX](https://data.humdata.org/dataset/ethiopia-attacks-on-aid-operations-education-health-and-protection) · **License:** `cc-by-sa` · **Updated:** 2026-04-06
---
## Abstract
This page contains information on reported incidents of violence and threats affecting aid operations and workers, education, health care services, and refugee and IDP camps in [Ethiopia](https://insecurityinsight.org/country-pages/ethiopia). It also provides information on incidents of conflict-related sexual violence (CRSV) and explosive weapons use affecting aid access, education and health care services. Also included are datasets cited in the annual reports of the [Safeguarding Health in Conflict Coalition (SHCC)](https://www.safeguardinghealth.org/). Please get in touch if you are interested in curated datasets: info@insecurityinsight.org
Each row in this dataset represents a discrete event or incident. Temporal coverage is indicated by the `date` and `date_event_entered` columns. Geographic scope: **ETH**.
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
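
The temporal coverage mentioned above can be read directly from the date columns. The following is a minimal sketch, assuming the data is already in a pandas DataFrame with the `date` and `date_event_entered` columns from the schema. The helper name `temporal_coverage` and the toy rows are illustrative and not part of the dataset.

```python
import pandas as pd


def temporal_coverage(df: pd.DataFrame, columns=("date", "date_event_entered")) -> dict:
    """Return the (earliest, latest) timestamp for each date column that is present."""
    coverage = {}
    for col in columns:
        if col in df.columns:
            # Coerce unparseable values to NaT and ignore them.
            values = pd.to_datetime(df[col], errors="coerce").dropna()
            if not values.empty:
                coverage[col] = (values.min(), values.max())
    return coverage


# Toy rows for illustration only; real values come from the dataset itself.
toy = pd.DataFrame(
    {
        "date": ["2021-03-01", "2022-11-15", "2020-07-09"],
        "date_event_entered": ["2021-03-05", "2022-11-20", "2020-07-12"],
    }
)
coverage = temporal_coverage(toy)
print(coverage["date"])
```

On the real data, pass `train` or `test` (or their concatenation) in place of `toy`.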
---
## Dataset Characteristics
| | |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Discrete events or incidents |
| **Rows (total)** | 117 |
| **Columns** | 42 (26 numeric, 13 categorical, 3 datetime) |
| **Train split** | 93 rows |
| **Test split** | 23 rows |
| **Geographic scope** | ETH |
| **Publisher** | Insecurity Insight |
| **HDX last updated** | 2026-04-06 |
---
## Variables
**Geographic** — `country` (Ethiopia), `country_iso` (ETH), `admin_1` (Tigray, Amhara Region, No Information), `location_of_incident` (No information, Road, Compound or Office Building), `geo_precision` (censored).
**Temporal** — `date`, `date_event_entered`, `date_event_modified`.
**Demographic** — breakdowns by sex and nationality: `female_aid_workers_killed` (range 0.0–1.0), `male_aid_workers_killed` (range 0.0–2.0), `female_aid_workers_injured` (range 0.0–1.0), `male_aid_workers_injured` (range 0.0–1.0), `female_aid_workers_kidnapped` (range 0.0–1.0) and 15 others.
**Outcome / Measurement** — `aid_workers_killed` (range 0.0–6.0), `aid_workers_injured` (range 0.0–6.0), `aid_workers_kidnapped` (range 0.0–9.0), `aid_workers_arrested` (range 0.0–72.0), `aid_workers_killed_in_captivity` (range 0.0–4.0), `known_kidnapping_or_arrest_outcome`.
**Identifier / Metadata** — `sind_event_id`, `esa_source`, `esa_processed`.
**Other** — `reported_perpetrator` (No Information, NSA, Police), `reported_perpetrator_name` (Unidentified armed actor, Ethiopian Police, Criminal), `weapon_carried_used` (Firearms, No Information on the Weapon Used, Unarmed Perpetrator), `organisation_affected` (INGO, LNGO, UN Agency), `programme_focus` (No information, Multiple, Health).
---
## Quick Start
```python
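from datasets import load_dataset

# Download both splits from the Hugging Face Hub.
ds = load_dataset("electricsheepafrica/africa-ethiopia-attacks-on-aid-operations-education-health-and-protection")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())  # explicit print so the preview also shows outside a notebook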
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `date` | datetime64[ns] | 0.0% | |
| `country` | object | 0.0% | Ethiopia |
| `country_iso` | object | 0.0% | ETH |
| `admin_1` | object | 0.0% | Tigray, Amhara Region, No Information |
| `geo_precision` | object | 0.0% | censored |
| `location_of_incident` | object | 0.0% | No information, Road, Compound or Office Building |
| `reported_perpetrator` | object | 0.0% | No Information, NSA, Police |
| `reported_perpetrator_name` | object | 0.0% | Unidentified armed actor, Ethiopian Police, Criminal |
| `weapon_carried_used` | object | 0.0% | Firearms, No Information on the Weapon Used, Unarmed Perpetrator |
| `organisation_affected` | object | 0.0% | INGO, LNGO, UN Agency |
| `programme_focus` | object | 0.0% | No information, Multiple, Health |
| `aid_workers_killed` | int64 | 0.0% | 0.0 – 6.0 (mean 0.547) |
| `aid_workers_injured` | int64 | 0.0% | 0.0 – 6.0 (mean 0.4359) |
| `aid_workers_kidnapped` | int64 | 0.0% | 0.0 – 9.0 (mean 0.4274) |
| `aid_workers_arrested` | int64 | 0.0% | 0.0 – 72.0 (mean 1.3675) |
| `known_kidnapping_or_arrest_outcome` | object | 57.3% | |
| `aid_workers_killed_in_captivity` | int64 | 0.0% | 0.0 – 4.0 (mean 0.0769) |
| `international_aid_workers_killed` | int64 | 0.0% | 0.0 – 2.0 (mean 0.0342) |
| `international_aid_workers_killed_in_captivity` | int64 | 0.0% | 0.0 – 0.0 (mean 0.0) |
| `national_aid_workers_killed` | int64 | 0.0% | 0.0 – 6.0 (mean 0.4957) |
| `national_aid_workers_killed_in_captivity` | int64 | 0.0% | 0.0 – 4.0 (mean 0.0769) |
| `female_aid_workers_killed` | int64 | 0.0% | 0.0 – 1.0 (mean 0.0256) |
| `female_aid_workers_killed_in_captivity` | int64 | 0.0% | 0.0 – 0.0 (mean 0.0) |
| `male_aid_workers_killed` | int64 | 0.0% | 0.0 – 2.0 (mean 0.2735) |
| `male_aid_workers_killed_in_captivity` | int64 | 0.0% | 0.0 – 2.0 (mean 0.0342) |
| `international_aid_workers_injured` | int64 | 0.0% | 0.0 – 1.0 (mean 0.0427) |
| `national_aid_workers_injured` | int64 | 0.0% | 0.0 – 6.0 (mean 0.2991) |
| `female_aid_workers_injured` | int64 | 0.0% | 0.0 – 1.0 (mean 0.0256) |
| `male_aid_workers_injured` | int64 | 0.0% | 0.0 – 1.0 (mean 0.1453) |
| `international_aid_workers_kidnapped` | int64 | 0.0% | 0.0 – 2.0 (mean 0.0769) |
| `national_aid_workers_kidnapped` | int64 | 0.0% | 0.0 – 9.0 (mean 0.3248) |
| `female_aid_workers_kidnapped` | int64 | 0.0% | 0.0 – 1.0 (mean 0.0085) |
| `male_aid_workers_kidnapped` | int64 | 0.0% | |
| `international_aid_workers_arrested` | int64 | 0.0% | |
| `national_aid_workers_arrested` | int64 | 0.0% | |
| `female_aid_workers_arrested` | int64 | 0.0% | |
| `male_aid_workers_arrested` | int64 | 0.0% | |
| `sind_event_id` | int64 | 0.0% | |
| `date_event_entered` | datetime64[ns] | 0.0% | |
| `date_event_modified` | datetime64[ns] | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `aid_workers_killed` | 0.0 | 6.0 | 0.547 | 0.0 |
| `aid_workers_injured` | 0.0 | 6.0 | 0.4359 | 0.0 |
| `aid_workers_kidnapped` | 0.0 | 9.0 | 0.4274 | 0.0 |
| `aid_workers_arrested` | 0.0 | 72.0 | 1.3675 | 0.0 |
| `aid_workers_killed_in_captivity` | 0.0 | 4.0 | 0.0769 | 0.0 |
| `international_aid_workers_killed` | 0.0 | 2.0 | 0.0342 | 0.0 |
| `international_aid_workers_killed_in_captivity` | 0.0 | 0.0 | 0.0 | 0.0 |
| `national_aid_workers_killed` | 0.0 | 6.0 | 0.4957 | 0.0 |
| `national_aid_workers_killed_in_captivity` | 0.0 | 4.0 | 0.0769 | 0.0 |
| `female_aid_workers_killed` | 0.0 | 1.0 | 0.0256 | 0.0 |
| `female_aid_workers_killed_in_captivity` | 0.0 | 0.0 | 0.0 | 0.0 |
| `male_aid_workers_killed` | 0.0 | 2.0 | 0.2735 | 0.0 |
| `male_aid_workers_killed_in_captivity` | 0.0 | 2.0 | 0.0342 | 0.0 |
| `international_aid_workers_injured` | 0.0 | 1.0 | 0.0427 | 0.0 |
| `national_aid_workers_injured` | 0.0 | 6.0 | 0.2991 | 0.0 |
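
A summary like the table above can be recomputed from a DataFrame. This is a hedged sketch: the function name `numeric_summary` and the toy rows are illustrative, and the card does not say which rows (full data or a single split) the published figures were computed on, so recomputed values may differ slightly.

```python
import pandas as pd


def numeric_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Min, max, mean and median for every numeric column, one row per column."""
    numeric = df.select_dtypes(include="number")
    summary = numeric.agg(["min", "max", "mean", "median"]).T
    summary.columns = ["Min", "Max", "Mean", "Median"]
    return summary.round(4)


# Toy rows for illustration only; they are not taken from the dataset.
toy = pd.DataFrame(
    {
        "aid_workers_killed": [0, 0, 6, 1],
        "aid_workers_injured": [0, 2, 0, 0],
        "admin_1": ["Tigray", "Tigray", "Amhara Region", "No Information"],
    }
)
summary = numeric_summary(toy)
print(summary)
```

Text columns such as `admin_1` are skipped automatically because only numeric dtypes are selected.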
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. Three columns with >80% missing values were removed: `event_description`, `latitude`, `longitude`. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
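
The curation steps described above can be approximated in pandas. This is a sketch under stated assumptions, not the pipeline Electric Sheep Africa actually ran. The function names, the case-insensitive marker matching, and the use of `DataFrame.sample` for the split are all assumptions. The card reports 93 train and 23 test rows, and this stand-in split may not reproduce those counts.

```python
import re

import numpy as np
import pandas as pd

# Markers named in the curation notes; case-insensitive matching is an assumption.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name: str) -> str:
    """Lowercase a column name and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_cell(value):
    """Map string cells that match a missing-value marker to NaN."""
    if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
        return np.nan
    return value


def curate(raw: pd.DataFrame, max_missing: float = 0.80) -> pd.DataFrame:
    df = raw.rename(columns=to_snake_case)
    df = df.apply(lambda col: col.map(clean_cell))
    # Keep only columns whose share of missing values is at most the threshold.
    return df.loc[:, df.isna().mean() <= max_missing]


def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Hypothetical stand-in for the 80/20 split with a fixed seed."""
    test = df.sample(frac=0.2, random_state=seed)
    train = df.drop(test.index)
    return train, test


# Toy frame for illustration only.
raw = pd.DataFrame(
    {
        "Country ISO": ["ETH"] * 10,
        "Event Description": ["N/A"] * 9 + ["shooting"],
        "Aid Workers Killed": list(range(10)),
    }
)
cleaned = curate(raw)
train, test = split_80_20(cleaned)
print(list(cleaned.columns), len(train), len(test))
```

Saving the result with `train.to_parquet(path, compression="snappy")` would match the storage format named above, provided a Parquet engine such as `pyarrow` is installed.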
---
## Limitations
- Data originates from Insecurity Insight and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- One column, `known_kidnapping_or_arrest_outcome` (57.3% null), has >20% missing values and should be treated with caution in modelling.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/ethiopia-attacks-on-aid-operations-education-health-and-protection) for the publisher's own methodology notes and caveats.
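
The missing-value caveat above can be checked before modelling. The sketch below flags columns above a missing-value threshold. It also measures how often text columns hold a "No Information" placeholder, which appears among the schema's sample values and is not counted as `NaN`. The helper names and toy rows are illustrative assumptions.

```python
import pandas as pd


def high_missing_columns(df: pd.DataFrame, threshold: float = 0.20) -> pd.Series:
    """Share of missing values per column, restricted to columns above the threshold."""
    share = df.isna().mean()
    return share[share > threshold].sort_values(ascending=False)


def placeholder_share(df: pd.DataFrame, columns, prefix: str = "no information") -> pd.Series:
    """Share of cells per column that start with the placeholder text (case-insensitive)."""
    shares = {
        col: df[col].astype(str).str.lower().str.startswith(prefix).mean()
        for col in columns
    }
    return pd.Series(shares, dtype=float)


# Toy rows for illustration only; category labels follow the schema's sample values.
toy = pd.DataFrame(
    {
        "known_kidnapping_or_arrest_outcome": ["Released", None, None, None],
        "admin_1": ["Tigray", "No Information", "Amhara Region", "Tigray"],
        "weapon_carried_used": [
            "Firearms",
            "No Information on the Weapon Used",
            "No Information on the Weapon Used",
            "Unarmed Perpetrator",
        ],
    }
)
flagged = high_missing_columns(toy)
placeholders = placeholder_share(toy, ["admin_1", "weapon_carried_used"])
print(flagged)
print(placeholders)
```

Whether a placeholder should be kept as its own category or recoded as missing depends on the analysis, and the card does not prescribe either.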
---
## Citation
```bibtex
@dataset{hdx_africa_ethiopia_attacks_on_aid_operations_education_health_and_protection,
title = {Ethiopia (ETH): Attacks on Aid Operations, Education, Health Care and IDP/Refugee Camps, and Conflict-Related Sexual Violence and Explosive Weapons Incident Data},
author = {Insecurity Insight},
year = {2026},
url = {https://data.humdata.org/dataset/ethiopia-attacks-on-aid-operations-education-health-and-protection},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provided by:
electricsheepafrica
Collected summary
Dataset introduction

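Construction
In conflict and security research, datasets are typically built from systematic event records kept by authoritative organisations. This dataset is published by Insecurity Insight on the Humanitarian Data Exchange (HDX) and covers violent incidents in Ethiopia targeting aid operations, education, health care facilities, and camps for displaced people. The raw data was retrieved through the CKAN API, then cleaned and standardised by Electric Sheep Africa: missing-value markers were unified, column names were normalised, and columns with very high missing rates were removed. The result was split into training and test sets and stored as Snappy-compressed Parquet, which keeps the data machine-readable and structurally consistent.
Features
The dataset uses the incident as its unit of observation and contains 117 records and 42 variables describing each event along several dimensions. Its geographic scope is limited to Ethiopia, with administrative regions including Tigray and Amhara. The variables combine temporal and spatial information with demographic detail: incident date, location and perpetrator information, plus counts of aid workers killed, injured, kidnapped, or arrested. Breakdowns by sex and nationality support analysis of who is most exposed in conflict. Although small, the dataset is clearly structured and rich in fields, which makes it suitable for exploratory analysis and predictive modelling.
Usage
For machine learning, the dataset suits tabular classification and regression tasks, such as predicting incident severity or identifying high-risk areas. It can be loaded directly with the Hugging Face `datasets` library and converted to a pandas DataFrame for further processing. Because some fields have a high missing rate, handle missing values carefully or apply feature selection before modelling. The dataset ships with a predefined train/test split for model evaluation and validation. Researchers can combine the geographic and temporal variables to build spatio-temporal models, or use the categorical variables to study incident patterns and contributing factors, in support of humanitarian response and security planning.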
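Background and challenges
Background overview
In research on complex emergencies and the security of humanitarian action, systematic records of violence against aid workers and essential service facilities are critical. Created by Insecurity Insight and published in 2026, this Ethiopia incident dataset focuses on violence against aid operations, education and health care facilities, and refugee camps, as well as conflict-related sexual violence. By recording 42 variables, including when and where incidents occurred, casualty counts, and perpetrator information, it provides a structured basis for quantitative analysis of humanitarian crises in conflict settings. The data comes from the Humanitarian Data Exchange and was converted into a machine-learning-ready format by Electric Sheep Africa, with the aim of supporting empirical research on public safety and conflict prevention.
Current challenges
The dataset addresses the problem of monitoring violent incidents and assessing risk in conflict areas. The core task is extracting standardised incident features from sparse, unstructured field reports to support pattern recognition and predictive modelling. Construction faced several challenges. The raw data has reporting bias and missing information: more than 57% of values in the kidnapping or arrest outcome field (`known_kidnapping_or_arrest_outcome`) are empty. The event description and latitude/longitude columns were removed because more than 80% of their values were missing, and `geo_precision` is recorded as censored, which limits the precision of spatio-temporal analysis. Data collection also relies on reports from non-governmental organisations, so coverage may be uneven and definitions inconsistent. Categorical entries such as "No information" need careful, domain-informed handling to keep models reliable.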
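Common use cases
Typical use case
In conflict and security research, this dataset provides structured data for analysing violence against humanitarian operations, education, and health care facilities in Ethiopia. Researchers typically use its time-series and geographic features to study how attacks evolve over space and time, identify high-risk areas and periods, and assess the lasting effect of conflict dynamics on public service delivery. Such analysis helps expose weak points in the mechanisms that protect non-combatants in armed conflict.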
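
As a starting point for the spatio-temporal analysis described above, incidents can be counted per region and year. This is a minimal sketch that assumes the `admin_1` and `date` columns from the schema. The function name and toy rows are illustrative, and calendar-year grouping is just one possible choice of period.

```python
import pandas as pd


def incidents_by_region_and_year(df: pd.DataFrame) -> pd.DataFrame:
    """Count incidents per (admin_1, year), most affected combinations first."""
    dates = pd.to_datetime(df["date"], errors="coerce")
    counts = (
        df.assign(year=dates.dt.year)
        .dropna(subset=["year"])  # drop rows whose date could not be parsed
        .groupby(["admin_1", "year"])
        .size()
        .rename("incidents")
        .reset_index()
    )
    counts["year"] = counts["year"].astype(int)
    return counts.sort_values(
        ["incidents", "admin_1", "year"], ascending=[False, True, True]
    ).reset_index(drop=True)


# Toy rows for illustration only; they are not taken from the dataset.
toy = pd.DataFrame(
    {
        "admin_1": ["Tigray", "Tigray", "Amhara Region", "Tigray"],
        "date": ["2021-01-10", "2021-06-02", "2021-09-30", "2022-02-14"],
    }
)
counts = incidents_by_region_and_year(toy)
print(counts)
```

With only about a hundred rows in the real data, counts per region and year will be small, so they are better read as descriptive than as a basis for strong inference.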
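Academic problems addressed
The dataset helps address the shortage of quantitative evidence in conflict research, giving an empirical basis for testing theoretical hypotheses about the diffusion of violence, target selection, and conflict escalation. By combining indicators on deaths, injuries, kidnappings, and arrests, scholars can systematically examine how the organisational characteristics of attacks, patterns of weapon use, and perpetrator identity relate to one another. This deepens understanding of how security threats arise in complex emergencies and moves humanitarian protection policy towards an evidence-based footing.
Derived and related work
Work derived from this dataset includes conflict-event prediction models, for example training classifiers on spatio-temporal features and perpetrator attributes to give early warning of potential attacks. Related academic work extends to cross-regional comparison, examining how the Ethiopian case resembles and differs from other conflict zones in Africa. Other studies fuse the data with satellite imagery and social media streams to build multi-source humanitarian situational-awareness platforms that improve the timeliness and precision of crisis response.
The content above was collected and summarised by 遇见数据集.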



