electricsheepafrica/africa-gardo-district-conflict-and-security-assessment-report-2015
Source: Hugging Face · Updated 2026-04-11 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-gardo-district-conflict-and-security-assessment-report-2015
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- conflict-violence
- som
pretty_name: "Gardo District Conflict and Security Assessment Report - 2015"
dataset_info:
  splits:
  - name: train
    num_examples: 156
  - name: test
    num_examples: 39
---
# Gardo District Conflict and Security Assessment Report - 2015
**Publisher:** Observatory of Conflict and Violence Prevention (inactive) · **Source:** [HDX](https://data.humdata.org/dataset/gardo-district-conflict-and-security-assessment-report-2015) · **License:** `cc-by-igo` · **Updated:** 2023-03-03
---
## Abstract
As part of its continual assessment of issues directly affecting community security and safety, the Observatory of Conflict and Violence Prevention (OCVP) conducted an extensive collection of primary data in Gardo District, the capital of Karkaar region of Puntland.
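Each row is described as a subnational administrative unit observation; the respondent-level columns (`gender_responder`, `age`, `marital_status`) suggest that rows correspond to individual survey responses. Data was last updated on HDX on 2023-03-03. Geographic scope: **SOM**.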
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
---
## Dataset Characteristics
| Property | Value |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Subnational administrative unit observations |
| **Rows (total)** | 196 |
| **Columns** | 149 (34 numeric, 115 categorical, 0 datetime) |
| **Train split** | 156 rows |
| **Test split** | 39 rows |
| **Geographic scope** | SOM |
| **Publisher** | Observatory of Conflict and Violence Prevention (inactive) |
| **HDX last updated** | 2023-03-03 |
---
## Variables
**Geographic:** `region_name` (range 1.0–1.0), `district_name` (range 1.0–1.0), `reporting_petty_crime` (range 1.0–777.0), `reporting_petty_other` (sample values: blank, "Gudiga xalinta khilafaadka", "Gudiga"), `police_yearly_trend` (range 1.0–777.0) and 33 others.
**Demographic:** `village_name` (range 1.0–5.0), `gender_responder` (range 1.0–2.0), `age` (range 1.0–6.0), `legal_clinic_issuehhviolence`, `court_issuehhviolence` and 2 others.
**Outcome / Measurement:** `number_of_stations` (range 1.0–777.0), `number_of_stations_other`, `number_of_courts` (range 1.0–777.0), `number_of_courts_other`, `number_of_conflicts` and 2 others.
**Identifier / Metadata:** `legal_clinic_ref`, `legal_clinic_ref_other`, `legal_clinic_issuebusidisputes`, `court_ref`, `court_ref_other` and 11 others.
**Other:** `marital_status` (range 1.0–888.0), `level_education` (range 1.0–7.0), `police_presense` (range 1.0–777.0), `distance_to_station` (range 1.0–777.0), `reporting_civil` (range 1.0–777.0) and 76 others.
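
Many column names share a prefix that marks the survey block they belong to (for example `court_issue…`, `elders_issue…`, `community_issues…`). A minimal sketch of selecting such a block by prefix follows. The helper name `columns_with_prefix` is hypothetical and the short column list is copied from this card for illustration; with a loaded DataFrame `df`, the same helper could be called as `df[columns_with_prefix(df.columns, "court_issue")]`.

```python
def columns_with_prefix(columns, prefix):
    """Return the column names that start with `prefix`, in original order."""
    return [name for name in columns if name.startswith(prefix)]


# A few column names copied from the schema of this card.
example_columns = [
    "court_use",
    "court_issuelanddispute",
    "court_issuerobbery",
    "court_issue_other",
    "elders_issuelanddispute",
    "conflict_reasonpower",
]

court_issue_columns = columns_with_prefix(example_columns, "court_issue")
print(court_issue_columns)
```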
---
## Quick Start
```python
from datasets import load_dataset
ds = load_dataset("electricsheepafrica/africa-gardo-district-conflict-and-security-assessment-report-2015")
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()
print(train.shape)
train.head()
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `region_name` | int64 | 0.0% | 1.0 – 1.0 (mean 1.0) |
| `district_name` | int64 | 0.0% | 1.0 – 1.0 (mean 1.0) |
| `village_name` | int64 | 0.0% | 1.0 – 5.0 (mean 3.0153) |
| `gender_responder` | int64 | 0.0% | 1.0 – 2.0 (mean 1.4898) |
| `age` | int64 | 0.0% | 1.0 – 6.0 (mean 3.1582) |
| `marital_status` | int64 | 0.0% | 1.0 – 888.0 (mean 15.6684) |
| `level_education` | int64 | 0.0% | 1.0 – 7.0 (mean 3.4235) |
| `police_presense` | int64 | 0.0% | 1.0 – 777.0 (mean 24.7755) |
| `number_of_stations` | float64 | 5.1% | 1.0 – 777.0 (mean 63.6828) |
| `number_of_stations_other` | object | 0.0% | |
| `distance_to_station` | float64 | 5.1% | 1.0 – 777.0 (mean 52.3763) |
| `reporting_civil` | int64 | 0.0% | 1.0 – 777.0 (mean 35.5561) |
| `reporting_civil_other` | object | 0.0% | (blank), Gudiga xaafada, Guddi xafadda |
| `reporting_petty_crime` | int64 | 0.0% | 1.0 – 777.0 (mean 31.5408) |
| `reporting_petty_other` | object | 0.0% | (blank), Gudiga xalinta khilafaadka, Gudiga |
| `reporting_serious_crime` | int64 | 0.0% | 1.0 – 777.0 (mean 31.4184) |
| `reporting_serious_other` | object | 0.0% | (blank), Hadba kii ku dhow |
| `trusted_sec_prov` | int64 | 0.0% | 1.0 – 777.0 (mean 35.3673) |
| `trusted_sec_other` | object | 0.0% | (blank), Kulli |
| `reason_for_choice_sec` | float64 | 6.1% | 1.0 – 777.0 (mean 11.2065) |
| `reason_for_choice_sec_other` | object | 0.0% | (blank), Waa lag baqaah, Wa dad wayo atag weeye |
| `level_trust_police` | int64 | 0.0% | 1.0 – 777.0 (mean 42.7806) |
| `police_yearly_trend` | int64 | 0.0% | 1.0 – 777.0 (mean 92.8214) |
| `court_presense` | int64 | 0.0% | 1.0 – 777.0 (mean 44.602) |
| `number_of_courts` | float64 | 10.7% | 1.0 – 777.0 (mean 23.1886) |
| `number_of_courts_other` | object | 0.0% | |
| `where_is_court` | float64 | 10.7% | 1.0 – 777.0 (mean 36.5486) |
| `distance_to_court` | object | 0.0% | 3, (blank), 2 |
| `legal_clinic_aware` | int64 | 0.0% | |
| `legal_clinic_use` | object | 0.0% | (blank), 2 |
| `legal_clinic_ref` | object | 0.0% | |
| `legal_clinic_ref_other` | object | 0.0% | |
| `legal_clinic_issuelanddispute` | object | 0.0% | |
| `legal_clinic_issuebusidisputes` | object | 0.0% | |
| `legal_clinic_issuerobbery` | object | 0.0% | |
| `legal_clinic_issueyouthviol` | object | 0.0% | |
| `legal_clinic_issuehhviolence` | object | 0.0% | |
| `legal_clinic_issueassault` | object | 0.0% | |
| `legal_clinic_issueother` | object | 0.0% | |
| `legal_clinic_issuerta` | object | 0.0% | |
| `legal_clinic_issue_other` | object | 0.0% | |
| `legal_clinic_judgement` | object | 0.0% | |
| `legal_clinic_enforced` | object | 0.0% | |
| `court_use` | int64 | 0.0% | |
| `court_ref` | object | 0.0% | |
| `court_ref_other` | object | 0.0% | |
| `court_issuelanddispute` | object | 0.0% | |
| `court_issuebusidisputes` | object | 0.0% | |
| `court_issuerobbery` | object | 0.0% | |
| `court_issueyouthviol` | object | 0.0% | |
| `court_issuehhviolence` | object | 0.0% | |
| `court_issueassault` | object | 0.0% | |
| `court_issueother` | object | 0.0% | |
| `court_issuerta` | object | 0.0% | |
| `court_issue_other` | object | 0.0% | |
| `court_judgement` | object | 0.0% | |
| `court_enforced` | object | 0.0% | |
| `elders_use` | int64 | 0.0% | |
| `elders_ref` | object | 0.0% | |
| `elders_ref_other` | object | 0.0% | |
| `elders_issuelanddispute` | object | 0.0% | |
| `elders_issuebusidisputes` | object | 0.0% | |
| `elders_issuerobbery` | object | 0.0% | |
| `elders_issueyouthviol` | object | 0.0% | |
| `elders_issuehhviolence` | object | 0.0% | |
| `elders_issueassault` | object | 0.0% | |
| `elders_issueother` | object | 0.0% | |
| `elders_issuerta` | object | 0.0% | |
| `elders_issue_other` | object | 0.0% | |
| `elders_judgement` | object | 0.0% | |
| `elders_enforced` | object | 0.0% | |
| `religious_use` | int64 | 0.0% | |
| `religious_ref` | object | 0.0% | |
| `religious_ref_other` | object | 0.0% | |
| `religious_issuelanddispute` | object | 0.0% | |
| `religious_issuebusidisputes` | object | 0.0% | |
| `religious_issuerobbery` | object | 0.0% | |
| `religious_issueyouthviol` | object | 0.0% | |
| `religious_issuehhviolence` | object | 0.0% | |
| `religious_issueassault` | object | 0.0% | |
| `religious_issueother` | object | 0.0% | |
| `religious_issuerta` | object | 0.0% | |
| `religious_issue_other` | object | 0.0% | |
| `religious_judgement` | object | 0.0% | |
| `religious_enforced` | object | 0.0% | |
| `trusted_just_prov` | int64 | 0.0% | |
| `trusted_just_prov_other` | object | 0.0% | |
| `reason_for_choice_just` | object | 0.0% | |
| `reason_for_choice_just_other` | object | 0.0% | |
| `conf_formal_just` | int64 | 0.0% | |
| `court_yearly_trend` | int64 | 0.0% | |
| `local_council_aware` | int64 | 0.0% | |
| `loc_gov_serviceseducation` | object | 0.0% | |
| `loc_gov_serviceshealth` | object | 0.0% | |
| `loc_gov_servicessecurity` | object | 0.0% | |
| `loc_gov_servicesjustice` | object | 0.0% | |
| `loc_gov_servicesagriculture` | object | 0.0% | |
| `loc_gov_servicesinfrastructure` | object | 0.0% | |
| `loc_gov_servicessanitation` | object | 0.0% | |
| `loc_gov_serviceswater` | object | 0.0% | |
| `loc_gov_servicesother` | object | 0.0% | |
| `loc_gov_servicesdontknow` | object | 0.0% | |
| `loc_gov_servicesrta` | object | 0.0% | |
| `loc_gov_services_other` | object | 0.0% | |
| `channels_comm` | object | 0.0% | |
| `consultation_participation` | int64 | 0.0% | |
| `participation_frequency` | object | 0.0% | |
| `participation_frequency_other` | object | 0.0% | |
| `elected_opinion` | int64 | 0.0% | |
| `community_issueslackofwater` | object | 0.0% | |
| `community_issuesdrought` | object | 0.0% | |
| `community_issueslofinfrastructure` | object | 0.0% | |
| `community_issuespoorsanitation` | object | 0.0% | |
| `community_issuespoorhealth` | object | 0.0% | |
| `community_issuesunemployment` | object | 0.0% | |
| `community_issuespooreducation` | object | 0.0% | |
| `community_issuesshortelectsupply` | object | 0.0% | |
| `community_issuespooreconomy` | object | 0.0% | |
| `community_issuescharcoalpdefor` | object | 0.0% | |
| `community_issuesbadhealthc` | object | 0.0% | |
| `community_issuesinsecurity` | object | 0.0% | |
| `community_issuesgenderbasedv` | object | 0.0% | |
| `community_issuesother` | object | 0.0% | |
| `community_issuesdontknow` | object | 0.0% | |
| `community_issuesrta` | object | 0.0% | |
| `community_issues_other` | object | 0.0% | |
| `council_yearly_trend` | object | 0.0% | |
| `witnessed_conflict` | int64 | 0.0% | |
| `number_of_conflicts` | object | 0.0% | |
| `number_conf_violence` | object | 0.0% | |
| `number_casualties` | object | 0.0% | |
| `conflict_reasonresources` | object | 0.0% | |
| `conflict_reasonfamilydisp` | object | 0.0% | |
| `conflict_reasoncrime` | object | 0.0% | |
| `conflict_reasonpower` | object | 0.0% | |
| `conflict_reasonrevenge` | object | 0.0% | |
| `conflict_reasonbusidisputes` | object | 0.0% | |
| `conflict_reasonrape` | object | 0.0% | |
| `conflict_reasonlackofjustice` | object | 0.0% | |
| `conflict_reasonyouthviol` | object | 0.0% | |
| `conflict_reasonother` | object | 0.0% | |
| `conflict_reasondontknow` | object | 0.0% | |
| `conflict_reasonrta` | object | 0.0% | |
| `conflict_reason_other` | object | 0.0% | |
| `witnessed_crimes` | int64 | 0.0% | |
| `how_safe` | int64 | 0.0% | |
| `safety_yearly_trend` | int64 | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |
---
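
Many `object` columns in the schema above report 0.0% nulls while their sample values are blank, which suggests that empty strings are stored in place of missing values. The sketch below assumes this reading is correct and converts empty or whitespace-only strings to `NaN`, so that null percentages reflect actual content. The helper `blank_to_nan` and the toy frame are hypothetical, not part of the dataset tooling.

```python
import numpy as np
import pandas as pd


def blank_to_nan(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where empty or whitespace-only strings become NaN."""
    out = df.copy()
    for col in out.columns:
        # Only string cells are inspected; numbers and NaN are left untouched.
        is_blank = out[col].map(lambda v: isinstance(v, str) and v.strip() == "")
        if is_blank.any():
            out[col] = out[col].mask(is_blank, np.nan)
    return out


# Toy frame mimicking two columns from the schema (values are illustrative).
toy = pd.DataFrame(
    {
        "legal_clinic_use": ["", "2", " "],
        "court_use": [1, 2, 2],
    }
)
cleaned = blank_to_nan(toy)
null_share = cleaned.isna().mean()
print(null_share)
```

Applied to the real splits, the same call would show how sparsely the follow-up columns (for example the `legal_clinic_*` block) are actually filled.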
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `region_name` | 1.0 | 1.0 | 1.0 | 1.0 |
| `district_name` | 1.0 | 1.0 | 1.0 | 1.0 |
| `village_name` | 1.0 | 5.0 | 3.0153 | 3.0 |
| `gender_responder` | 1.0 | 2.0 | 1.4898 | 1.0 |
| `age` | 1.0 | 6.0 | 3.1582 | 3.0 |
| `marital_status` | 1.0 | 888.0 | 15.6684 | 2.0 |
| `level_education` | 1.0 | 7.0 | 3.4235 | 3.0 |
| `police_presense` | 1.0 | 777.0 | 24.7755 | 1.0 |
| `number_of_stations` | 1.0 | 777.0 | 63.6828 | 1.0 |
| `distance_to_station` | 1.0 | 777.0 | 52.3763 | 2.0 |
| `reporting_civil` | 1.0 | 777.0 | 35.5561 | 5.0 |
| `reporting_petty_crime` | 1.0 | 777.0 | 31.5408 | 5.0 |
| `reporting_serious_crime` | 1.0 | 777.0 | 31.4184 | 5.0 |
| `trusted_sec_prov` | 1.0 | 777.0 | 35.3673 | 5.0 |
| `reason_for_choice_sec` | 1.0 | 777.0 | 11.2065 | 3.0 |
---
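
In the numeric summary above, maxima of 777.0 and 888.0 sit far above the medians (for example, `marital_status` has a median of 2.0 and a maximum of 888.0). This pattern is typical of survey sentinel codes such as "don't know" or "refused", but the card does not document these codes, so treating them as non-responses is an assumption. Under that assumption, a minimal sketch for masking them before computing statistics follows; `SENTINEL_CODES` and `mask_sentinels` are hypothetical names.

```python
import pandas as pd

# Assumed sentinel codes; the dataset card does not define their meaning.
SENTINEL_CODES = [777, 888]


def mask_sentinels(df: pd.DataFrame, codes=SENTINEL_CODES) -> pd.DataFrame:
    """Return a copy where assumed sentinel codes in numeric columns are NaN."""
    out = df.copy()
    for col in out.select_dtypes(include="number").columns:
        # Series.mask replaces matching cells with NaN (integers become float64).
        out[col] = out[col].mask(out[col].isin(codes))
    return out


# Toy values shaped like the coded answers in this dataset (illustrative only).
toy = pd.DataFrame(
    {
        "marital_status": [1, 2, 2, 888],
        "police_presense": [1, 1, 777, 2],
    }
)
masked = mask_sentinels(toy)
print(masked.mean())
```

Comparing means before and after masking shows how strongly a handful of coded answers can inflate the reported averages.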
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. Five columns were cast from string to numeric or datetime because more than 85% of their values parsed successfully. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
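
The steps described above could look roughly like the following. This is a sketch under stated assumptions, not ESA's actual pipeline: the function names, the snake_case rule, the case-insensitive marker matching, and the use of `DataFrame.sample` for the split are illustrative choices. Only the marker list, the 85% threshold, the 80/20 ratio, and seed 42 come from the text, and the toy frame is invented.

```python
import re

import numpy as np
import pandas as pd

# Markers listed in the Curation text; case-insensitive matching is an assumption.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name: str) -> str:
    """Lowercase a column name and collapse non-alphanumeric runs to '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def unify_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Replace the listed missing-value markers in string cells with NaN."""
    def clean(value):
        if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
            return np.nan
        return value
    return df.apply(lambda col: col.map(clean))


def cast_if_mostly_numeric(col: pd.Series, threshold: float = 0.85) -> pd.Series:
    """Cast to numeric when more than `threshold` of non-null values parse."""
    non_null = col.dropna()
    if non_null.empty:
        return col
    parsed = pd.to_numeric(non_null, errors="coerce")
    if parsed.notna().mean() > threshold:
        return pd.to_numeric(col, errors="coerce")
    return col


def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Shuffle with a fixed seed and split into 80% train and 20% test."""
    train = df.sample(frac=0.8, random_state=seed)
    test = df.drop(train.index)
    return train, test


# Invented raw frame with untidy headers and missing-value markers.
raw = pd.DataFrame(
    {
        "Village Name": ["1", "2", "N/A", "3", "5", "4", "2", "1", "3", "5"],
        "Channels-Comm": ["radio", "unknown", "sms", "-", "radio",
                          "sms", "radio", "sms", "radio", "sms"],
    }
)
raw.columns = [to_snake_case(c) for c in raw.columns]
curated = unify_missing(raw).apply(cast_if_mostly_numeric)
train, test = split_80_20(curated)
print(curated.dtypes, len(train), len(test))
```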
---
## Limitations
- Data originates from Observatory of Conflict and Violence Prevention (inactive) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/gardo-district-conflict-and-security-assessment-report-2015) for the publisher's own methodology notes and caveats.
---
## Citation
```bibtex
@dataset{hdx_africa_gardo_district_conflict_and_security_assessment_report_2015,
title = {Gardo District Conflict and Security Assessment Report - 2015},
author = {Observatory of Conflict and Violence Prevention (inactive)},
year = {2023},
url = {https://data.humdata.org/dataset/gardo-district-conflict-and-security-assessment-report-2015},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provider:
electricsheepafrica
Collected summary
Dataset introduction

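Construction method
In conflict and security research, systematic data collection is essential to understanding community security dynamics. This dataset derives from the Gardo District Conflict and Security Assessment Report. The Observatory of Conflict and Violence Prevention collected the primary data through field surveys focused on Gardo District, the capital of the Karkaar region of Puntland, Somalia. Observations are recorded at the level of administrative units and span geography, demographics, perceptions of security, and justice services. The raw data was published on the Humanitarian Data Exchange (HDX) and then standardised by the Electric Sheep Africa team, who unified missing-value markers, optimised data types, and converted the result to Parquet for machine learning analysis.
Usage
At the intersection of machine learning and social science, this dataset offers an empirical basis for building conflict-prediction and security-assessment models. Users can load it directly with the Hugging Face `datasets` library and explore it in Python. It is already split into training and test sets and can be converted to pandas DataFrames for statistical analysis or feature engineering. Researchers can use the geographic, demographic, and security-related variables to develop classification or regression models that predict security trends or assess the impact of public services. Because the data comes from one organisation's original survey, it should be read together with the original methodology notes and with its limitations in mind, so that conclusions remain robust.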
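Background and challenges
Background overview
In conflict and security research, systematic assessment of a specific area is the basis for understanding community security dynamics and designing effective interventions. The Gardo District Conflict and Security Assessment Report - 2015 dataset was created in 2015 by the now-inactive Observatory of Conflict and Violence Prevention (OCVP). Its aim was to examine the core issues affecting community security by collecting field data in Gardo District, in the Puntland region of Somalia. The dataset covers geographic, demographic, justice-service, and conflict-event variables, giving researchers a resource for analysing the roots of conflict and evaluating security governance. Its release adds to the empirical material available for conflict research in Africa and provides a data foundation for later machine learning work on public safety.
Current challenges
The dataset addresses complex questions in conflict and security assessment. Its core challenge is identifying security risk factors and governance effectiveness accurately from multidimensional community data. Construction faced several difficulties, including possible access restrictions and a lack of respondent trust when collecting data in an unstable area, as well as inconsistent reporting and varied missing-value markers in the raw data. The dataset is also small (196 rows in total), and its variables may be highly imbalanced, which limits how well trained models generalise. Automated cleaning unified the missing values but cannot correct definitional inconsistencies or sampling bias in the raw data, which may affect the reliability of any analysis.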
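Common scenarios
Typical use cases
In conflict and security research, this dataset serves as a typical community security assessment case and is often used to build machine learning models that predict regional conflict risk. Researchers use variables such as police presence, crime-reporting frequency, and access to justice institutions to train classification or regression algorithms that identify the key factors affecting community security. Such work deepens the understanding of security dynamics in Gardo, Somalia, and offers a reproducible template for analysing similarly fragile settings.
Academic problems addressed
The dataset helps with two problems in conflict research: data scarcity and a lack of fine-grained analysis. By providing detailed observations below the national administrative level, it lets scholars examine how perceptions of security, trust in justice, and drivers of conflict relate to one another. It offers an empirical basis for quantitatively testing theories of security governance and supports a shift in humanitarian research from macro-level description to micro-level analysis of mechanisms, which matters for understanding how post-conflict societies build resilience.
Practical applications
In practice, non-governmental organisations and local policymakers use the dataset to guide resource allocation and intervention design. For example, data on police coverage and crime reporting can inform the placement of security posts, and assessments of community trust in justice services can shape rule-of-law outreach programmes. These applications make humanitarian response more precise and help strengthen community protection where resources are limited.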
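Recent research on the dataset
Latest research directions
In conflict and security research, the most recent uses of the dataset focus on analysing community security dynamics with machine learning. The dataset records detailed conflict and security assessment data for Gardo, Somalia, covering police presence, use of justice services, and community conflict. Current research explores how ensemble learning and natural language processing can extract patterns from these structured and semi-structured variables in order to predict local conflict risk and evaluate the effect of interventions. This work deepens the understanding of security mechanisms in fragile areas and gives international humanitarian organisations data-driven decision support, particularly for resource allocation and early-warning systems.
The content above was collected and summarised by 遇见数据集.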



