electricsheepafrica/africa-jowhar-district-conflict-and-security-assessment-2015
Hugging Face · Updated 2026-04-11 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-jowhar-district-conflict-and-security-assessment-2015
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- complex-emergency-conflict-security
- som
pretty_name: "Jowhar District Conflict and Security Assessment - 2015"
dataset_info:
  splits:
  - name: train
    num_examples: 152
  - name: test
    num_examples: 38
---
# Jowhar District Conflict and Security Assessment - 2015
**Publisher:** Observatory of Conflict and Violence Prevention (inactive) · **Source:** [HDX](https://data.humdata.org/dataset/jowhar-district-conflict-and-security-assessment-2015) · **License:** `cc-by-igo` · **Updated:** 2023-03-03
---
## Abstract
As part of the continual assessments of issues affecting community security and safety, OCVP conducted extensive primary data collection in Jowhar district.
Further details: http://www.ocvp.org/ocvp5/index.php/publications/dcsa/41-jowhar-district-conflict-and-security-assessment-report-2015
Each row in this dataset represents subnational administrative unit observations. Data was last updated on HDX on 2023-03-03. Geographic scope: **SOM**.
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
---
## Dataset Characteristics
| | |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Subnational administrative unit observations |
| **Rows (total)** | 191 |
| **Columns** | 123 (32 numeric, 91 categorical, 0 datetime) |
| **Train split** | 152 rows |
| **Test split** | 38 rows |
| **Geographic scope** | SOM |
| **Publisher** | Observatory of Conflict and Violence Prevention (inactive) |
| **HDX last updated** | 2023-03-03 |
---
## Variables
**Geographic:** `region_name` (range 1.0–1.0), `district_name` (range 1.0–1.0), `reporting_petty_crime` (range 1.0–888.0), `reporting_petty_other` (sample values: blank, `Qaraabada`), `police_yearly_trend` (range 1.0–777.0) and 24 others.
**Demographic:** `village_name` (range 1.0–4.0), `gender_responder` (range 1.0–2.0), `age` (range 1.0–6.0).
**Outcome / Measurement:** `number_of_stations` (range 1.0–777.0), `number_of_stations_other` (blank), `number_of_courts` (sample values: `1`, blank, `2`), `number_of_courts_other` (blank), `number_of_conflicts` and 2 others.
**Identifier / Metadata:** `legal_clinic_ref`, `legal_clinic_ref_other`, `court_ref`, `court_ref_other`, `elders_ref` and 8 others.
**Other:** `marital_status` (range 1.0–4.0), `level_education` (range 1.0–7.0), `police_presense` (range 1.0–777.0), `distance_to_station` (range 1.0–777.0), `reporting_civil` (range 1.0–888.0) and 66 others.
---
## Quick Start
```python
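from datasets import load_dataset

# Download the dataset from the Hugging Face Hub (requires network access).
ds = load_dataset("electricsheepafrica/africa-jowhar-district-conflict-and-security-assessment-2015")

# Convert each split to a pandas DataFrame.
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()

print(train.shape)
print(train.head())  # print() so the preview also shows outside a notebook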
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `region_name` | int64 | 0.0% | 1.0 – 1.0 (mean 1.0) |
| `district_name` | int64 | 0.0% | 1.0 – 1.0 (mean 1.0) |
| `village_name` | int64 | 0.0% | 1.0 – 4.0 (mean 2.5026) |
| `gender_responder` | int64 | 0.0% | 1.0 – 2.0 (mean 1.3979) |
| `age` | int64 | 0.0% | 1.0 – 6.0 (mean 3.178) |
| `marital_status` | int64 | 0.0% | 1.0 – 4.0 (mean 2.1414) |
| `level_education` | int64 | 0.0% | 1.0 – 7.0 (mean 4.0628) |
| `police_presense` | int64 | 0.0% | 1.0 – 777.0 (mean 5.1204) |
| `number_of_stations` | float64 | 6.3% | 1.0 – 777.0 (mean 5.3464) |
| `number_of_stations_other` | object | 0.0% | |
| `distance_to_station` | float64 | 6.3% | 1.0 – 777.0 (mean 9.6872) |
| `reporting_civil` | int64 | 0.0% | 1.0 – 888.0 (mean 12.6126) |
| `reporting_civil_other` | object | 0.0% | , Gudimiyaha gobolka, Family |
| `reporting_petty_crime` | int64 | 0.0% | 1.0 – 888.0 (mean 17.0942) |
| `reporting_petty_other` | object | 0.0% | , Qaraabada |
| `reporting_serious_crime` | int64 | 0.0% | 1.0 – 888.0 (mean 24.2618) |
| `reporting_serious_other` | object | 0.0% | |
| `trusted_sec_prov` | int64 | 0.0% | 1.0 – 777.0 (mean 10.3613) |
| `trusted_sec_other` | object | 0.0% | , Gudomiyaha gobolka, Gudomiyha gobolka |
| `reason_for_choice_sec` | float64 | 3.7% | 1.0 – 888.0 (mean 7.4402) |
| `reason_for_choice_sec_other` | object | 0.0% | , Iyaga lagu kala baxi ka, Walaga wada baqaya |
| `level_trust_police` | int64 | 0.0% | 1.0 – 888.0 (mean 39.7173) |
| `police_yearly_trend` | int64 | 0.0% | 1.0 – 777.0 (mean 66.4817) |
| `court_presense` | int64 | 0.0% | 1.0 – 777.0 (mean 9.3613) |
| `number_of_courts` | object | 0.0% | 1, , 2 |
| `number_of_courts_other` | object | 0.0% | |
| `where_is_court` | object | 0.0% | 1, , 2 |
| `distance_to_court` | object | 0.0% | 1, , 2 |
| `legal_clinic_aware` | int64 | 0.0% | 1.0 – 777.0 (mean 62.8482) |
| `legal_clinic_use` | object | 0.0% | |
| `legal_clinic_ref` | object | 0.0% | |
| `legal_clinic_ref_other` | object | 0.0% | |
| `legal_clinic_issue` | object | 0.0% | |
| `legal_clinic_issue_other` | object | 0.0% | |
| `legal_clinic_judgement` | object | 0.0% | |
| `legal_clinic_enforced` | object | 0.0% | |
| `court_use` | int64 | 0.0% | 1.0 – 2.0 (mean 1.644) |
| `court_ref` | object | 0.0% | |
| `court_ref_other` | object | 0.0% | |
| `court_issue` | object | 0.0% | |
| `court_issue_other` | object | 0.0% | |
| `court_judgement` | object | 0.0% | |
| `court_enforced` | object | 0.0% | |
| `elders_use` | int64 | 0.0% | |
| `elders_ref` | object | 0.0% | |
| `elders_ref_other` | object | 0.0% | |
| `elders_issue` | object | 0.0% | |
| `elders_issue_other` | object | 0.0% | |
| `elders_judgement` | object | 0.0% | |
| `elders_enforced` | object | 0.0% | |
| `religious_use` | int64 | 0.0% | |
| `religious_ref` | object | 0.0% | |
| `religious_ref_other` | object | 0.0% | |
| `religious_issue` | object | 0.0% | |
| `religious_issue_other` | object | 0.0% | |
| `religious_judgement` | object | 0.0% | |
| `religious_enforced` | object | 0.0% | |
| `trusted_just_prov` | int64 | 0.0% | |
| `trusted_just_prov_other` | object | 0.0% | |
| `reason_for_choice_just` | float64 | 5.8% | |
| `reason_for_choice_just_other` | object | 0.0% | |
| `conf_formal_just` | int64 | 0.0% | |
| `court_yearly_trend` | int64 | 0.0% | |
| `local_council_aware` | int64 | 0.0% | |
| `aware_of_services` | object | 0.0% | |
| `channels_comm` | object | 0.0% | |
| `consultation_participation` | object | 0.0% | |
| `participation_frequency` | object | 0.0% | |
| `participation_frequency_other` | object | 0.0% | |
| `elected_opinion` | int64 | 0.0% | |
| `loc_gov_serviceseducation` | object | 0.0% | |
| `loc_gov_serviceshealth` | object | 0.0% | |
| `loc_gov_servicessecurity` | object | 0.0% | |
| `loc_gov_servicesjustice` | object | 0.0% | |
| `loc_gov_servicesagriculture` | object | 0.0% | |
| `loc_gov_servicesinfrastructure` | object | 0.0% | |
| `loc_gov_servicessanitation` | object | 0.0% | |
| `loc_gov_serviceswater` | object | 0.0% | |
| `loc_gov_servicesother` | object | 0.0% | |
| `loc_gov_servicesdont_know` | object | 0.0% | |
| `loc_gov_servicesrefused_to_answer` | object | 0.0% | |
| `loc_gov_services_other` | object | 0.0% | |
| `community_issueslack_of_water` | object | 0.0% | |
| `community_issuesdrought` | object | 0.0% | |
| `community_issueslack_of_infrastructure` | object | 0.0% | |
| `community_issuespoor_sanitation` | object | 0.0% | |
| `community_issuespoor_health` | object | 0.0% | |
| `community_issuesunemployment` | object | 0.0% | |
| `community_issuespoor_education` | object | 0.0% | |
| `community_issuesshortage_of_electicity_supply` | object | 0.0% | |
| `community_issuespoor_economy` | object | 0.0% | |
| `community_issuescharcoal_production_deforestation` | object | 0.0% | |
| `community_issuesbad_health_centers` | object | 0.0% | |
| `community_issuesinsecurity` | object | 0.0% | |
| `community_issuesgender_based_violence` | object | 0.0% | |
| `community_issuesother` | object | 0.0% | |
| `community_issuesdont_know` | object | 0.0% | |
| `community_issuesrefused_to_answer` | object | 0.0% | |
| `community_issues_other` | object | 0.0% | |
| `council_yearly_trend` | object | 0.0% | |
| `witnessed_conflict` | int64 | 0.0% | |
| `number_of_conflicts` | object | 0.0% | |
| `number_conf_violence` | object | 0.0% | |
| `number_casualties` | object | 0.0% | |
| `conflict_reasonresources` | object | 0.0% | |
| `conflict_reasonfamily_disputes` | object | 0.0% | |
| `conflict_reasoncrime` | object | 0.0% | |
| `conflict_reasonpower` | object | 0.0% | |
| `conflict_reasonrevenge` | object | 0.0% | |
| `conflict_reasonbusiness_disputes` | object | 0.0% | |
| `conflict_reasonrape` | object | 0.0% | |
| `conflict_reasonlack_of_justice` | object | 0.0% | |
| `conflict_reasonother` | object | 0.0% | |
| `conflict_reasondont_know` | object | 0.0% | |
| `conflict_reasonrefused_to_answer` | object | 0.0% | |
| `conflict_reason_other` | object | 0.0% | |
| `witnessed_crimes` | int64 | 0.0% | |
| `how_safe` | int64 | 0.0% | |
| `safety_yearly_trend` | int64 | 0.0% | |
| `nspc` | object | 0.0% | |
| `njpc` | object | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |
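
The schema reports 0.0% nulls for many text columns whose sample values appear blank. One possible reading, which is an assumption and not something the card confirms, is that those columns hold empty or whitespace-only strings. The sketch below uses a hypothetical helper, `blank_to_nan`, and a small made-up frame to show how such values could be converted to `NaN` before computing null rates.

```python
import numpy as np
import pandas as pd

def blank_to_nan(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which empty or whitespace-only strings become NaN."""
    out = df.copy()
    for col in out.columns:
        # True only for string values that are empty after stripping whitespace.
        is_blank = out[col].map(
            lambda v: isinstance(v, str) and v.strip() == ""
        ).astype(bool)
        if is_blank.any():
            out[col] = out[col].mask(is_blank, np.nan)
    return out

# Made-up rows that mimic the sample values shown in the schema table.
demo = pd.DataFrame({
    "reporting_petty_other": [" ", "Qaraabada", ""],
    "court_use": [1, 2, 2],
})
cleaned = blank_to_nan(demo)
null_share = cleaned.isna().mean()  # share of missing values per column
print(null_share)
```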
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `region_name` | 1.0 | 1.0 | 1.0 | 1.0 |
| `district_name` | 1.0 | 1.0 | 1.0 | 1.0 |
| `village_name` | 1.0 | 4.0 | 2.5026 | 2.0 |
| `gender_responder` | 1.0 | 2.0 | 1.3979 | 1.0 |
| `age` | 1.0 | 6.0 | 3.178 | 3.0 |
| `marital_status` | 1.0 | 4.0 | 2.1414 | 2.0 |
| `level_education` | 1.0 | 7.0 | 4.0628 | 4.0 |
| `police_presense` | 1.0 | 777.0 | 5.1204 | 1.0 |
| `number_of_stations` | 1.0 | 777.0 | 5.3464 | 1.0 |
| `distance_to_station` | 1.0 | 777.0 | 9.6872 | 1.0 |
| `reporting_civil` | 1.0 | 888.0 | 12.6126 | 5.0 |
| `reporting_petty_crime` | 1.0 | 888.0 | 17.0942 | 5.0 |
| `reporting_serious_crime` | 1.0 | 888.0 | 24.2618 | 2.0 |
| `trusted_sec_prov` | 1.0 | 777.0 | 10.3613 | 2.0 |
| `reason_for_choice_sec` | 1.0 | 888.0 | 7.4402 | 2.0 |
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. Four columns were cast from string to numeric or datetime types where more than 85% of their values parsed successfully. The dataset was then split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
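
The steps described above can be approximated in pandas. This is a minimal sketch under stated assumptions, not the actual Electric Sheep Africa pipeline: the function names are hypothetical, markers are matched case-insensitively, the parse-success rate is measured over non-missing values only, and the split uses `DataFrame.sample`, which may not reproduce the published partitions.

```python
import numpy as np
import pandas as pd

# Markers listed in the card; case-insensitive matching is an assumption.
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}
PARSE_THRESHOLD = 0.85  # minimum share of values that must parse as numbers

def unify_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Replace known missing-value markers with NaN."""
    def to_nan(value):
        if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
            return np.nan
        return value
    return df.apply(lambda col: col.map(to_nan))

def cast_if_mostly_numeric(col: pd.Series) -> pd.Series:
    """Cast a column to numeric when enough of its present values parse."""
    present = col.dropna()
    if present.empty:
        return col
    parsed = pd.to_numeric(present, errors="coerce")
    if parsed.notna().mean() > PARSE_THRESHOLD:
        return pd.to_numeric(col, errors="coerce")
    return col

def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Draw 20% of rows for test with a fixed seed; the rest is train."""
    test = df.sample(frac=0.2, random_state=seed)
    train = df.drop(index=test.index)
    return train, test

# Made-up raw values for illustration only.
raw = pd.DataFrame({
    "age": ["1", "2", "N/A", "3", "4", "5", "6", "2", "3", "1"],
    "village": ["a", "b", "unknown", "c", "d", "a", "b", "c", "d", "a"],
})
clean = unify_missing(raw).apply(cast_if_mostly_numeric)
train, test = split_80_20(clean)
print(clean.dtypes)
print(len(train), len(test))
```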
---
## Limitations
- Data originates from Observatory of Conflict and Violence Prevention (inactive) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/jowhar-district-conflict-and-security-assessment-2015) for the publisher's own methodology notes and caveats.
---
## Citation
```bibtex
@dataset{hdx_africa_jowhar_district_conflict_and_security_assessment_2015,
title = {Jowhar District Conflict and Security Assessment - 2015},
author = {Observatory of Conflict and Violence Prevention (inactive)},
year = {2023},
url = {https://data.humdata.org/dataset/jowhar-district-conflict-and-security-assessment-2015},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provider:
electricsheepafrica
Collected summary
Dataset introduction

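Construction
In conflict and security assessment, datasets are typically built from field surveys and structured data collection. This dataset was built by the Observatory of Conflict and Violence Prevention (OCVP) through extensive primary data collection focused on community security issues in Jowhar district, Somalia. Data collection involved observations recorded at the level of subnational administrative units and covered geography, demographics, security perceptions, and justice services. The raw data was published on the Humanitarian Data Exchange (HDX) and then standardised by the Electric Sheep Africa team, which unified missing-value markers, optimised data types, and converted the data to Parquet format for machine learning tasks, making it easier to access and analyse.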
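Usage
The dataset can be loaded directly with the Hugging Face `datasets` library. Call `load_dataset` in Python and convert the splits to pandas DataFrames for exploratory analysis and model building. The dataset suits tabular classification, regression, and related tasks, and can support research such as security situation prediction and assessment of community service needs. Given its structured features and defined variables, researchers can select geographic, demographic, or security-related variables as features and combine them with domain knowledge to build predictive models that explore the links between conflict and security.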
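
Feature selection of this kind might look like the sketch below. It assumes that integer columns such as `gender_responder` and `age` hold category codes, not quantities, which the value ranges in the schema suggest but the card does not state. The helper `encode_coded_features`, the choice of `how_safe` as a target, and the demo rows are all illustrative.

```python
import pandas as pd

def encode_coded_features(df: pd.DataFrame, feature_cols: list) -> pd.DataFrame:
    """One-hot encode integer-coded survey answers so codes are not treated as magnitudes."""
    return pd.get_dummies(df[feature_cols], columns=feature_cols, dtype=int)

# Made-up rows using column names from the schema.
demo = pd.DataFrame({
    "gender_responder": [1, 2, 1],
    "age": [3, 1, 3],
    "how_safe": [1, 2, 1],
})
X = encode_coded_features(demo, ["gender_responder", "age"])
y = demo["how_safe"]  # hypothetical target column
print(X.columns.tolist())
```

One-hot encoding is only one option; with about 190 rows, wide encodings can easily outnumber the observations, so a small set of features is probably safer.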
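Background and challenges
Background overview
In conflict and security research, micro-level assessments of specific areas are essential for understanding community dynamics. The Jowhar District Conflict and Security Assessment - 2015 dataset was created in 2015 by the now-inactive Observatory of Conflict and Violence Prevention to examine, through field data collection in Jowhar district, Somalia, the core issues affecting community security and stability. It covers variables spanning geography, demographics, justice services, crime reporting, and community issues, providing a local empirical basis for quantitative analysis in public health and security. Its release adds to the data available for African conflict research and provides structured data for applying machine learning models in complex humanitarian settings.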
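Current challenges
The dataset targets the challenge of multivariate analysis in conflict and security assessment, which involves quantitative modelling of complex topics such as perceptions of community security, access to justice services, and drivers of conflict. The original data collection faced the difficulty of conducting surveys in a high-risk environment, which may limit sample representativeness. The data also contains many categorical variables and missing-value markers, including special codes such as `777` and `888`, which complicate cleaning and interpretation. In addition, the dataset is small, with only 191 observations, which significantly constrains how well machine learning models can generalise. Field definitions may also be inconsistent, so the data should be interpreted carefully against the original methodology notes.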
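
The effect of the special codes mentioned above can be illustrated with a short sketch. It assumes that `777` and `888` never represent real answers in the selected columns; this should be checked against the original OCVP questionnaire, because the card does not define the codes. The helper `mask_sentinels` and the demo values are hypothetical.

```python
import pandas as pd

SENTINEL_CODES = [777, 888]  # assumed special codes; confirm against the source codebook

def mask_sentinels(df: pd.DataFrame, columns: list, codes=SENTINEL_CODES) -> pd.DataFrame:
    """Return a copy in which the given codes are replaced by NaN in the chosen columns."""
    out = df.copy()
    for col in columns:
        # where() keeps values that are not sentinel codes and inserts NaN elsewhere.
        out[col] = out[col].where(~out[col].isin(codes))
    return out

# Made-up answers for one coded column from the schema.
demo = pd.DataFrame({"level_trust_police": [1, 2, 888, 3, 777, 2]})
masked = mask_sentinels(demo, ["level_trust_police"])

print(demo["level_trust_police"].mean())    # inflated by the special codes
print(masked["level_trust_police"].mean())  # mean over the remaining answers
```

This would explain why several columns in the numeric summary have a mean far above their median: a handful of `777` or `888` entries dominate the average.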
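Common scenarios
Typical use cases
In conflict and security research, this dataset provides a micro-level empirical basis for analysing community security dynamics in Jowhar district, Somalia. Researchers typically use its variables, such as police presence, crime reporting frequency, and access to justice institutions, to build regression models or classifiers that reveal the key factors shaping local perceptions of security. With machine learning methods, they can quantify the associations between socio-demographic characteristics and security indicators, and so better understand how community resilience forms in conflict settings.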
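Academic problems addressed
The dataset addresses the scarcity of subnational security assessment data in conflict research. It lets researchers test theoretical hypotheses about the relationships among policing effectiveness, access to justice, and community trust, and offers standardised indicators for measuring human security in complex emergencies. Its significance lies in advancing quantitative methods in research on fragile settings and in providing a data basis for cross-regional comparison of security governance models, which deepens understanding of post-conflict social reconstruction.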
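Practical applications
In practice, humanitarian organisations and local governance bodies can use the dataset for targeted needs assessment and intervention planning. For example, analysing differences in trust in security providers across villages can help optimise resource allocation for community policing projects, and identifying conflict hotspots and gaps in justice services can help design targeted rule-of-law support projects. These data-driven insights offer a key reference for evidence-based peacebuilding and security sector reform in fragile states such as Somalia.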
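Recent research on the dataset
Latest research directions
In conflict and security research, community-level microdata is becoming key to understanding dynamics in fragile areas. With its detailed records at the sub-administrative-unit level, the Jowhar District Conflict and Security Assessment - 2015 dataset provides an empirical basis for exploring security governance in complex emergencies such as the one in Somalia. Current research focuses on machine learning methods, such as tabular classification and regression models, to analyse non-linear associations among variables including police presence, court use, and causes of conflict. This work helps predict community security trends, aligns with the focus on targeted intervention in global humanitarian response, and promotes data-driven policymaking in resource-constrained settings, with the aim of improving conflict prevention and recovery practice.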
The content above was collected and summarised by 遇见数据集.



