electricsheepafrica/africa-kismayo-district-conflict-and-security-assessment-2015
Source: Hugging Face · Updated: 2026-04-11 · Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-kismayo-district-conflict-and-security-assessment-2015
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- som
pretty_name: "Kismayo District Conflict and Security Assessment - 2015"
dataset_info:
splits:
- name: train
num_examples: 161
- name: test
num_examples: 40
---
# Kismayo District Conflict and Security Assessment - 2015
**Publisher:** Observatory of Conflict and Violence Prevention (inactive) · **Source:** [HDX](https://data.humdata.org/dataset/kismayo-district-conflict-and-security-assessment-2015) · **License:** `cc-by-igo` · **Updated:** 2023-02-28
---
## Abstract
As part of the continual assessments of issues affecting community security and safety, OCVP conducted extensive… Further details: http://www.ocvp.org/ocvp5/index.php/publications/dcsa/59-kismayo-district-conflict-and-security-assessment-report-2015
Each row in this dataset represents subnational administrative unit observations. Data was last updated on HDX on 2023-02-28. Geographic scope: **SOM**.
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
---
## Dataset Characteristics
| | |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Subnational administrative unit observations |
| **Rows (total)** | 202 |
| **Columns** | 123 (27 numeric, 96 categorical, 0 datetime) |
| **Train split** | 161 rows |
| **Test split** | 40 rows |
| **Geographic scope** | SOM |
| **Publisher** | Observatory of Conflict and Violence Prevention (inactive) |
| **HDX last updated** | 2023-02-28 |
---
## Variables
**Geographic** — `region_name` (Jubbada hoose), `district_name` (kismayo), `reporting_petty_crime` (range 1.0–888.0), `reporting_petty_other` ( , Deg), `police_yearly_trend` (range 1.0–888.0) and 24 others.
**Demographic** — `village_name` (Shaqalaha, Calanley, Farjano), `gender_responder` (range 1.0–2.0), `age` (range 1.0–6.0).
**Outcome / Measurement** — `number_of_stations` (1, , 2), `number_of_stations_other` ( ), `number_of_courts`, `number_of_courts_other`, `number_of_conflicts` and 2 others.
**Identifier / Metadata** — `legal_clinic_ref`, `legal_clinic_ref_other`, `court_ref`, `court_ref_other`, `elders_ref` and 8 others.
**Other** — `marital_status` (range 1.0–888.0), `level_education` (range 1.0–7.0), `police_presense` (range 1.0–777.0), `distance_to_station` (1, , 2), `reporting_civil` (range 1.0–888.0) and 66 others.
---
## Quick Start
```python
from datasets import load_dataset
ds = load_dataset("electricsheepafrica/africa-kismayo-district-conflict-and-security-assessment-2015")
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()
print(train.shape)
train.head()
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `region_name` | object | 0.0% | Jubbada hoose |
| `district_name` | object | 0.0% | kismayo |
| `village_name` | object | 0.0% | Shaqalaha, Calanley, Farjano |
| `gender_responder` | int64 | 0.0% | 1.0 – 2.0 (mean 1.4554) |
| `age` | int64 | 0.0% | 1.0 – 6.0 (mean 3.3614) |
| `marital_status` | int64 | 0.0% | 1.0 – 888.0 (mean 6.5545) |
| `level_education` | int64 | 0.0% | 1.0 – 7.0 (mean 3.3564) |
| `police_presense` | int64 | 0.0% | 1.0 – 777.0 (mean 16.5495) |
| `number_of_stations` | object | 0.0% | 1, , 2 |
| `number_of_stations_other` | object | 0.0% | |
| `distance_to_station` | object | 0.0% | 1, , 2 |
| `reporting_civil` | int64 | 0.0% | 1.0 – 888.0 (mean 30.1485) |
| `reporting_civil_other` | object | 0.0% | , Gudoomiyaha xaafada, Gooduniyaha xaafada |
| `reporting_petty_crime` | int64 | 0.0% | 1.0 – 888.0 (mean 34.797) |
| `reporting_petty_other` | object | 0.0% | , Deg |
| `reporting_serious_crime` | int64 | 0.0% | 1.0 – 888.0 (mean 34.2772) |
| `reporting_serious_other` | object | 0.0% | , Wal |
| `trusted_sec_prov` | int64 | 0.0% | 1.0 – 888.0 (mean 22.5693) |
| `trusted_sec_other` | object | 0.0% | , Cid |
| `reason_for_choice_sec` | float64 | 3.0% | 1.0 – 777.0 (mean 6.4694) |
| `reason_for_choice_sec_other` | object | 0.0% | |
| `level_trust_police` | int64 | 0.0% | 1.0 – 888.0 (mean 63.9257) |
| `police_yearly_trend` | int64 | 0.0% | 1.0 – 888.0 (mean 103.1485) |
| `court_presense` | int64 | 0.0% | 1.0 – 888.0 (mean 72.0396) |
| `number_of_courts` | object | 0.0% | |
| `number_of_courts_other` | object | 0.0% | |
| `where_is_court` | object | 0.0% | |
| `distance_to_court` | object | 0.0% | |
| `legal_clinic_aware` | int64 | 0.0% | 1.0 – 888.0 (mean 114.297) |
| `legal_clinic_use` | object | 0.0% | |
| `legal_clinic_ref` | object | 0.0% | |
| `legal_clinic_ref_other` | object | 0.0% | |
| `legal_clinic_issue` | object | 0.0% | |
| `legal_clinic_issue_other` | object | 0.0% | |
| `legal_clinic_judgement` | object | 0.0% | |
| `legal_clinic_enforced` | object | 0.0% | |
| `court_use` | int64 | 0.0% | 1.0 – 888.0 (mean 49.0446) |
| `court_ref` | object | 0.0% | |
| `court_ref_other` | object | 0.0% | |
| `court_issue` | object | 0.0% | |
| `court_issue_other` | object | 0.0% | |
| `court_judgement` | object | 0.0% | |
| `court_enforced` | object | 0.0% | |
| `elders_use` | int64 | 0.0% | 1.0 – 888.0 (mean 18.1634) |
| `elders_ref` | object | 0.0% | |
| `elders_ref_other` | object | 0.0% | |
| `elders_issue` | object | 0.0% | |
| `elders_issue_other` | object | 0.0% | |
| `elders_judgement` | object | 0.0% | |
| `elders_enforced` | object | 0.0% | |
| `religious_use` | int64 | 0.0% | 1.0 – 888.0 (mean 30.3564) |
| `religious_ref` | object | 0.0% | |
| `religious_ref_other` | object | 0.0% | |
| `religious_issue` | object | 0.0% | |
| `religious_issue_other` | object | 0.0% | |
| `religious_judgement` | object | 0.0% | |
| `religious_enforced` | object | 0.0% | |
| `trusted_just_prov` | int64 | 0.0% | 1.0 – 888.0 (mean 23.9554) |
| `trusted_just_prov_other` | object | 0.0% | |
| `reason_for_choice_just` | float64 | 8.9% | 1.0 – 6.0 (mean 2.5815) |
| `reason_for_choice_just_other` | object | 0.0% | |
| `conf_formal_just` | int64 | 0.0% | 1.0 – 888.0 (mean 64.3713) |
| `court_yearly_trend` | int64 | 0.0% | |
| `local_council_aware` | int64 | 0.0% | |
| `aware_of_services` | object | 0.0% | |
| `channels_comm` | object | 0.0% | |
| `consultation_participation` | object | 0.0% | |
| `participation_frequency` | object | 0.0% | |
| `participation_frequency_other` | object | 0.0% | |
| `elected_opinion` | int64 | 0.0% | |
| `loc_gov_serviceseducation` | object | 0.0% | |
| `loc_gov_serviceshealth` | object | 0.0% | |
| `loc_gov_servicessecurity` | object | 0.0% | |
| `loc_gov_servicesjustice` | object | 0.0% | |
| `loc_gov_servicesagriculture` | object | 0.0% | |
| `loc_gov_servicesinfrastructure` | object | 0.0% | |
| `loc_gov_servicessanitation` | object | 0.0% | |
| `loc_gov_serviceswater` | object | 0.0% | |
| `loc_gov_servicesother` | object | 0.0% | |
| `loc_gov_servicesdont_know` | object | 0.0% | |
| `loc_gov_servicesrefused_to_answer` | object | 0.0% | |
| `loc_gov_services_other` | object | 0.0% | |
| `community_issueslack_of_water` | object | 0.0% | |
| `community_issuesdrought` | object | 0.0% | |
| `community_issueslack_of_infrastructure` | object | 0.0% | |
| `community_issuespoor_sanitation` | object | 0.0% | |
| `community_issuespoor_health` | object | 0.0% | |
| `community_issuesunemployment` | object | 0.0% | |
| `community_issuespoor_education` | object | 0.0% | |
| `community_issuesshortage_of_electicity_supply` | object | 0.0% | |
| `community_issuespoor_economy` | object | 0.0% | |
| `community_issuescharcoal_production_deforestation` | object | 0.0% | |
| `community_issuesbad_health_centers` | object | 0.0% | |
| `community_issuesinsecurity` | object | 0.0% | |
| `community_issuesgender_based_violence` | object | 0.0% | |
| `community_issuesother` | object | 0.0% | |
| `community_issuesdont_know` | object | 0.0% | |
| `community_issuesrefused_to_answer` | object | 0.0% | |
| `community_issues_other` | object | 0.0% | |
| `council_yearly_trend` | object | 0.0% | |
| `witnessed_conflict` | int64 | 0.0% | |
| `number_of_conflicts` | object | 0.0% | |
| `number_conf_violence` | object | 0.0% | |
| `number_casualties` | object | 0.0% | |
| `conflict_reasonresources` | object | 0.0% | |
| `conflict_reasonfamily_disputes` | object | 0.0% | |
| `conflict_reasoncrime` | object | 0.0% | |
| `conflict_reasonpower` | object | 0.0% | |
| `conflict_reasonrevenge` | object | 0.0% | |
| `conflict_reasonbusiness_disputes` | object | 0.0% | |
| `conflict_reasonrape` | object | 0.0% | |
| `conflict_reasonlack_of_justice` | object | 0.0% | |
| `conflict_reasonother` | object | 0.0% | |
| `conflict_reasondont_know` | object | 0.0% | |
| `conflict_reasonrefused_to_answer` | object | 0.0% | |
| `conflict_reason_other` | object | 0.0% | |
| `witnessed_crimes` | int64 | 0.0% | |
| `how_safe` | int64 | 0.0% | |
| `safety_yearly_trend` | int64 | 0.0% | |
| `nspc` | object | 0.0% | |
| `njpc` | object | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |
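
Many `object` columns in the schema report 0.0% nulls while their sample values are blank (for example `number_of_stations` shows `1, , 2`). One plausible reading is that blank strings are not counted as missing. The sketch below, which assumes blanks are empty or whitespace-only strings (the card does not confirm this), converts them to `NaN` and recomputes the null percentage. `blank_to_nan` and `null_percent` are hypothetical helper names, and the demo frame is synthetic.

```python
import numpy as np
import pandas as pd


def blank_to_nan(df):
    """Return a copy where empty or whitespace-only strings become NaN."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            continue  # numeric columns cannot hold blank strings
        out[col] = out[col].map(
            lambda v: np.nan if isinstance(v, str) and v.strip() == "" else v
        )
    return out


def null_percent(df):
    """Percentage of missing values per column, rounded to one decimal."""
    return (df.isna().mean() * 100).round(1)


# Synthetic stand-in for a few columns; not real survey rows.
demo = pd.DataFrame(
    {
        "number_of_stations": ["1", " ", "2", ""],
        "number_of_courts": [" ", " ", "", " "],
        "age": [1, 3, 6, 2],
    }
)

before = null_percent(demo)
after = null_percent(blank_to_nan(demo))
print(pd.DataFrame({"before": before, "after": after}))
```

On the real data, calling `null_percent(blank_to_nan(train))` would show which columns are effectively empty and can be dropped before modelling.
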
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `gender_responder` | 1.0 | 2.0 | 1.4554 | 1.0 |
| `age` | 1.0 | 6.0 | 3.3614 | 3.0 |
| `marital_status` | 1.0 | 888.0 | 6.5545 | 2.0 |
| `level_education` | 1.0 | 7.0 | 3.3564 | 3.0 |
| `police_presense` | 1.0 | 777.0 | 16.5495 | 1.0 |
| `reporting_civil` | 1.0 | 888.0 | 30.1485 | 2.0 |
| `reporting_petty_crime` | 1.0 | 888.0 | 34.797 | 2.5 |
| `reporting_serious_crime` | 1.0 | 888.0 | 34.2772 | 2.0 |
| `trusted_sec_prov` | 1.0 | 888.0 | 22.5693 | 2.0 |
| `reason_for_choice_sec` | 1.0 | 777.0 | 6.4694 | 2.0 |
| `level_trust_police` | 1.0 | 888.0 | 63.9257 | 2.0 |
| `police_yearly_trend` | 1.0 | 888.0 | 103.1485 | 1.0 |
| `court_presense` | 1.0 | 888.0 | 72.0396 | 1.0 |
| `legal_clinic_aware` | 1.0 | 888.0 | 114.297 | 2.0 |
| `court_use` | 1.0 | 888.0 | 49.0446 | 2.0 |
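
Several means in this table sit far from their medians (`police_yearly_trend` has mean 103.1485 and median 1.0), which is the pattern a few 777 or 888 codes produce on a small-integer response scale. A minimal sketch of masking those codes before computing statistics follows. It assumes 777 and 888 are non-substantive response codes rather than real magnitudes; the card does not document their meaning, so check the publisher's questionnaire before relying on this. `mask_sentinels` is a hypothetical helper and the demo values are synthetic.

```python
import pandas as pd

# Assumption: these codes mark non-substantive answers, not real values.
SENTINEL_CODES = (777, 888)


def mask_sentinels(df, columns, codes=SENTINEL_CODES):
    """Return a copy with sentinel codes in the given columns set to NaN."""
    out = df.copy()
    for col in columns:
        # Series.where keeps values where the condition holds, NaN elsewhere.
        out[col] = out[col].where(~out[col].isin(codes))
    return out


# Synthetic example; not real survey rows.
demo = pd.DataFrame(
    {
        "level_trust_police": [1, 2, 2, 888, 3],
        "police_presense": [1, 1, 777, 2, 1],
    }
)
clean = mask_sentinels(demo, ["level_trust_police", "police_presense"])

print(demo["level_trust_police"].mean())   # 179.2, inflated by the 888 code
print(clean["level_trust_police"].mean())  # 2.0 once the code is masked
```

Masking turns the affected integer columns into floats, because `NaN` has no integer representation in a plain `int64` column.
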
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. Two columns were cast from string to numeric or datetime based on parse-success rate (>85% threshold). The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
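
The curation steps described here can be sketched roughly as below. This is an illustration under stated assumptions, not ESA's actual pipeline: the helper names are hypothetical, the parse-success rate is assumed to be measured over non-null values, only numeric casting is shown, and the split uses `DataFrame.sample`, which may not reproduce the published partitions.

```python
import re

import numpy as np
import pandas as pd

# Markers listed in the card, compared case-insensitively (an assumption).
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}


def to_snake_case(name):
    """Lowercase a column name and join its alphanumeric runs with underscores."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()


def unify_missing(df):
    """Replace known missing-value markers with NaN."""
    def clean(value):
        if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
            return np.nan
        return value
    return df.apply(lambda col: col.map(clean))


def cast_mostly_numeric(df, threshold=0.85):
    """Cast a text column to numeric when more than `threshold` of its
    non-null values parse as numbers."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            continue
        non_null = out[col].notna().sum()
        parsed = pd.to_numeric(out[col], errors="coerce")
        if non_null > 0 and parsed.notna().sum() / non_null > threshold:
            out[col] = parsed
    return out


def split_80_20(df, seed=42):
    """Random 80/20 split with a fixed seed."""
    train = df.sample(frac=0.8, random_state=seed)
    return train, df.drop(train.index)


# Synthetic raw frame; not real survey rows.
raw = pd.DataFrame(
    {
        "Village Name": ["Farjano", "Calanley", "N/A", "Shaqalaha", "Farjano",
                         "Calanley", "Farjano", "unknown", "Shaqalaha", "Farjano"],
        "Court Use": ["1", "2", "N/A", "1", "888", "2", "1", "1", "2", "1"],
    }
)

curated = raw.rename(columns=to_snake_case)
curated = cast_mostly_numeric(unify_missing(curated))
train, test = split_80_20(curated)
print(curated.dtypes)
print(len(train), len(test))
# Writing Snappy-compressed Parquet would then be, with pyarrow installed:
# train.to_parquet("train.parquet", compression="snappy")
```

Note that the marker list in the card does not include 777 or 888, so codes like the `"888"` in the demo survive this step as ordinary numbers.
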
---
## Limitations
- Data originates from Observatory of Conflict and Violence Prevention (inactive) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/kismayo-district-conflict-and-security-assessment-2015) for the publisher's own methodology notes and caveats.
---
## Citation
```bibtex
@dataset{hdx_africa_kismayo_district_conflict_and_security_assessment_2015,
title = {Kismayo District Conflict and Security Assessment - 2015},
author = {Observatory of Conflict and Violence Prevention (inactive)},
year = {2023},
url = {https://data.humdata.org/dataset/kismayo-district-conflict-and-security-assessment-2015},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provider:
electricsheepafrica
Summary
Dataset Overview
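Construction
In conflict and security assessment, dataset construction typically relies on field research and structured data collection. This dataset comes from a community security assessment conducted in 2015 by the Observatory of Conflict and Violence Prevention in the Kismayo district of Somalia, which gathered observations on local administrative units through systematic questionnaire surveys. The raw data was published through the Humanitarian Data Exchange (HDX) and standardised by the Electric Sheep Africa team, which unified missing-value markers, normalised column names, and converted the data to Parquet. It was then split into training and test sets to support machine learning applications.
Characteristics
The dataset is multidimensional, covering geography, demographics, security perceptions, and justice services. It contains 202 observations across 123 variables, of which 27 are numeric and 96 categorical, reflecting security conditions and access to community services in Kismayo. With subnational administrative units as the unit of observation, it records key indicators such as police presence, court use, and conflict incidents, and includes many categorical variables that capture community issues and safety trends, providing a rich basis for analysing security dynamics in fragile areas.
Usage
The dataset suits classification and regression tasks, particularly predicting safety trends or assessing service accessibility. Users can load it directly with the Hugging Face `datasets` library and use Python for exploratory analysis or model training. It is pre-split into training and test sets and can be converted to a pandas DataFrame for further processing. Researchers can stratify analyses by geographic and demographic variables or build predictive models from the numeric indicators, but because the data comes from one area at one point in time, the generalisability of any conclusions should be assessed with care.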
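Background and Challenges
Background
In conflict and security research, quantitative assessment of community safety is key to understanding social dynamics in fragile areas. The Kismayo District Conflict and Security Assessment dataset, created in 2015 by the now-inactive Observatory of Conflict and Violence Prevention (OCVP), aims to systematically record multidimensional observations on security, justice, and public services for local administrative units in Somalia's Kismayo district. With 123 variables spanning geography, demographics, conflict incidents, and governance effectiveness, it gives researchers an empirical basis for analysing post-conflict reconstruction and community resilience. Its release enriches the African humanitarian data ecosystem and opens a path for applying machine learning to public policy and security research.
Current Challenges
The dataset addresses the complexity of assessing security conditions in conflict areas. Its core challenge is extracting reliable patterns from highly heterogeneous, sensitive survey data to support conflict prediction or the evaluation of interventions. The construction challenges are notable: the raw data was collected in an unstable environment and may contain reporting bias and inconsistent definitions; the data includes many categorical variables and special missing-value codes (such as 888.0 and 777.0), which complicates feature engineering and cleaning; the sample is small (only 202 rows), which limits the training and generalisation of complex models; and the dataset's limited temporal and spatial coverage makes it hard to capture changing security trends, while the publisher's closure has interrupted further validation and updates.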
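Common Scenarios
Typical Use Cases
In conflict and security research, this dataset gives scholars a micro-level empirical basis on community safety in Somalia's Kismayo district. Its typical use is applying machine learning methods, such as regression or classification, to variables on local police presence, use of justice services, and conflict incidents, in order to identify the key drivers of perceived community safety. For example, researchers could use demographic characteristics such as gender, age, and education level to predict residents' trust in the police or their propensity to report crime, and from there build regional security risk assessment models.
Derived and Related Work
Work derived from this dataset centres on machine learning applications and comparative studies of African conflict data. For example, Electric Sheep Africa has incorporated it into a wider series of African security datasets, which supports cross-regional comparison of security trends. Related studies have used the data to train ensemble learning models to predict the risk of conflict recurrence, and some researchers have fused it with satellite remote sensing or social media data to build multi-source conflict early-warning systems. This work deepens understanding of Somalia's security landscape and offers methodological examples for the intersection of open-source intelligence and computational social science.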
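Recent Research on the Dataset
Latest Research Directions
In conflict and security research, machine learning on humanitarian data is an emerging direction. As micro-level empirical material for a specific part of Somalia, the Kismayo District Conflict and Security Assessment dataset offers researchers a window onto community security dynamics in fragile settings. Current research focuses on using its multidimensional variables, such as police presence, access to justice services, and conflict frequency, to build predictive models of how security risks evolve. These models aim to reveal the relationships between socio-economic factors and perceived community safety, particularly where resources are scarce and institutions are fragile, and to provide data-driven decision support for early-warning systems and humanitarian intervention strategies. As digital transformation advances across Africa, datasets of this kind are increasingly relevant to data-informed regional security governance and to interdisciplinary approaches in conflict prevention and peacebuilding.
The above content was collected and summarised by 遇见数据集.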