
electricsheepafrica/africa-adaado-district-conflict-and-security-assessment-2015

Source: Hugging Face · Updated 2026-04-11 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-adaado-district-conflict-and-security-assessment-2015
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- conflict-violence
- som
pretty_name: "Adaado District Conflict and Security Assessment - 2015"
dataset_info:
  splits:
  - name: train
    num_examples: 128
  - name: test
    num_examples: 32
---

# Adaado District Conflict and Security Assessment - 2015

**Publisher:** Observatory of Conflict and Violence Prevention (inactive) · **Source:** [HDX](https://data.humdata.org/dataset/adaado-district-conflict-and-security-assessment-2015) · **License:** `cc-by-igo` · **Updated:** 2023-03-03

---

## Abstract

As part of its continual assessment of issues directly affecting community security and safety, OCVP conducted an extensive collection of primary data in ADAADO District - the capital city of the Galgadud region in central Somalia.

Each row in this dataset represents subnational administrative unit observations. Data was last updated on HDX on 2023-03-03. Geographic scope: **SOM**.

*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*

---

## Dataset Characteristics

| | |
|---|---|
| **Domain** | Public health |
| **Unit of observation** | Subnational administrative unit observations |
| **Rows (total)** | 160 |
| **Columns** | 121 (37 numeric, 84 categorical, 0 datetime) |
| **Train split** | 128 rows |
| **Test split** | 32 rows |
| **Geographic scope** | SOM |
| **Publisher** | Observatory of Conflict and Violence Prevention (inactive) |
| **HDX last updated** | 2023-03-03 |

---

## Variables

**Geographic:** `region_name` (range 1.0–1.0), `district_name` (range 1.0–1.0), `reporting_petty_crime` (range 1.0–777.0), `reporting_petty_other`, `police_yearly_trend` (range 1.0–888.0) and 24 others.

**Demographic:** `village_name` (range 1.0–4.0), `gender_responder` (range 1.0–2.0), `age` (range 2.0–6.0).

**Outcome / Measurement:** `number_of_stations` (range 1.0–888.0), `number_of_stations_other`, `number_of_courts` (range 1.0–777.0), `number_of_courts_other`, `number_of_conflicts` and 2 others.

**Identifier / Metadata:** `legal_clinic_ref`, `legal_clinic_ref_other`, `court_ref`, `court_ref_other`, `elders_ref` and 8 others.

**Other:** `marital_status` (range 1.0–4.0), `level_education` (range 1.0–7.0), `police_presense` (range 1.0–888.0), `distance_to_station` (range 1.0–888.0), `reporting_civil` (range 1.0–777.0) and 64 others.

---

## Quick Start

```python
from datasets import load_dataset

ds = load_dataset("electricsheepafrica/africa-adaado-district-conflict-and-security-assessment-2015")
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()
print(train.shape)
train.head()
```

---

## Schema

| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `region_name` | int64 | 0.0% | 1.0 – 1.0 (mean 1.0) |
| `district_name` | int64 | 0.0% | 1.0 – 1.0 (mean 1.0) |
| `village_name` | int64 | 0.0% | 1.0 – 4.0 (mean 2.4625) |
| `gender_responder` | int64 | 0.0% | 1.0 – 2.0 (mean 1.3938) |
| `age` | int64 | 0.0% | 2.0 – 6.0 (mean 3.4188) |
| `marital_status` | int64 | 0.0% | 1.0 – 4.0 (mean 2.1) |
| `level_education` | int64 | 0.0% | 1.0 – 7.0 (mean 4.45) |
| `police_presense` | int64 | 0.0% | 1.0 – 888.0 (mean 64.7438) |
| `number_of_stations` | float64 | 8.1% | 1.0 – 888.0 (mean 60.068) |
| `number_of_stations_other` | object | 0.0% | |
| `distance_to_station` | float64 | 8.1% | 1.0 – 888.0 (mean 102.068) |
| `reporting_civil` | int64 | 0.0% | 1.0 – 777.0 (mean 17.6125) |
| `reporting_civil_other` | object | 0.0% | |
| `reporting_petty_crime` | int64 | 0.0% | 1.0 – 777.0 (mean 18.125) |
| `reporting_petty_other` | object | 0.0% | |
| `reporting_serious_crime` | int64 | 0.0% | 1.0 – 777.0 (mean 16.4438) |
| `reporting_serious_other` | object | 0.0% | |
| `trusted_sec_prov` | int64 | 0.0% | 1.0 – 777.0 (mean 11.8562) |
| `trusted_sec_other` | object | 0.0% | |
| `reason_for_choice_sec` | float64 | 1.2% | 1.0 – 5.0 (mean 2.4367) |
| `reason_for_choice_sec_other` | object | 0.0% | , Dadku cabsiday gartan |
| `level_trust_police` | int64 | 0.0% | 1.0 – 888.0 (mean 149.0938) |
| `police_yearly_trend` | int64 | 0.0% | 1.0 – 888.0 (mean 140.5938) |
| `court_presense` | int64 | 0.0% | 1.0 – 888.0 (mean 59.9125) |
| `number_of_courts` | float64 | 9.4% | 1.0 – 777.0 (mean 55.3103) |
| `number_of_courts_other` | object | 0.0% | |
| `where_is_court` | float64 | 9.4% | 1.0 – 777.0 (mean 22.5931) |
| `distance_to_court` | object | 0.0% | 1, , 2 |
| `legal_clinic_aware` | int64 | 0.0% | |
| `legal_clinic_use` | object | 0.0% | , 2 |
| `legal_clinic_ref` | object | 0.0% | |
| `legal_clinic_ref_other` | object | 0.0% | |
| `legal_clinic_issue` | object | 0.0% | |
| `legal_clinic_issue_other` | object | 0.0% | |
| `legal_clinic_judgement` | object | 0.0% | |
| `legal_clinic_enforced` | object | 0.0% | |
| `court_use` | int64 | 0.0% | |
| `court_ref` | object | 0.0% | |
| `court_ref_other` | object | 0.0% | |
| `court_issue` | object | 0.0% | |
| `court_issue_other` | object | 0.0% | |
| `court_judgement` | object | 0.0% | |
| `court_enforced` | object | 0.0% | |
| `elders_use` | int64 | 0.0% | |
| `elders_ref` | object | 0.0% | |
| `elders_ref_other` | object | 0.0% | |
| `elders_issue` | object | 0.0% | |
| `elders_issue_other` | object | 0.0% | |
| `elders_judgement` | object | 0.0% | |
| `elders_enforced` | object | 0.0% | |
| `religious_use` | int64 | 0.0% | |
| `religious_ref` | object | 0.0% | |
| `religious_ref_other` | object | 0.0% | |
| `religious_issue` | object | 0.0% | |
| `religious_issue_other` | object | 0.0% | |
| `religious_judgement` | object | 0.0% | |
| `religious_enforced` | object | 0.0% | |
| `trusted_just_prov` | int64 | 0.0% | |
| `trusted_just_prov_other` | object | 0.0% | |
| `reason_for_choice_just` | float64 | 9.4% | |
| `reason_for_choice_just_other` | object | 0.0% | |
| `conf_formal_just` | int64 | 0.0% | |
| `court_yearly_trend` | int64 | 0.0% | |
| `local_council_aware` | int64 | 0.0% | |
| `aware_of_services` | float64 | 8.1% | |
| `channels_comm` | float64 | 8.1% | |
| `consultation_participation` | object | 0.0% | |
| `participation_frequency` | object | 0.0% | |
| `participation_frequency_other` | object | 0.0% | |
| `elected_opinion` | int64 | 0.0% | |
| `loc_gov_serviceseducation` | object | 0.0% | |
| `loc_gov_serviceshealth` | object | 0.0% | |
| `loc_gov_servicessecurity` | object | 0.0% | |
| `loc_gov_servicesjustice` | object | 0.0% | |
| `loc_gov_servicesagriculture` | object | 0.0% | |
| `loc_gov_servicesinfrastructure` | object | 0.0% | |
| `loc_gov_servicessanitation` | object | 0.0% | |
| `loc_gov_serviceswater` | object | 0.0% | |
| `loc_gov_servicesother` | object | 0.0% | |
| `loc_gov_servicesdont_know` | object | 0.0% | |
| `loc_gov_servicesrefused_to_answer` | object | 0.0% | |
| `loc_gov_services_other` | object | 0.0% | |
| `community_issueslack_of_water` | object | 0.0% | |
| `community_issuesdrought` | object | 0.0% | |
| `community_issueslack_of_infrastructure` | object | 0.0% | |
| `community_issuespoor_sanitation` | object | 0.0% | |
| `community_issuespoor_health` | object | 0.0% | |
| `community_issuesunemployment` | object | 0.0% | |
| `community_issuespoor_education` | object | 0.0% | |
| `community_issuesshortage_of_electicity_supply` | object | 0.0% | |
| `community_issuespoor_economy` | object | 0.0% | |
| `community_issuescharcoal_production_deforestation` | object | 0.0% | |
| `community_issuesbad_health_centers` | object | 0.0% | |
| `community_issuesinsecurity` | object | 0.0% | |
| `community_issuesgender_based_violence` | object | 0.0% | |
| `community_issuesother` | object | 0.0% | |
| `community_issuesdont_know` | object | 0.0% | |
| `community_issuesrefused_to_answer` | object | 0.0% | |
| `community_issues_other` | object | 0.0% | |
| `council_yearly_trend` | float64 | 8.1% | |
| `witnessed_conflict` | int64 | 0.0% | |
| `number_of_conflicts` | object | 0.0% | |
| `number_conf_violence` | object | 0.0% | |
| `number_casualties` | object | 0.0% | |
| `conflict_reasonresources` | object | 0.0% | |
| `conflict_reasonfamily_disputes` | object | 0.0% | |
| `conflict_reasoncrime` | object | 0.0% | |
| `conflict_reasonpower` | object | 0.0% | |
| `conflict_reasonrevenge` | object | 0.0% | |
| `conflict_reasonbusiness_disputes` | object | 0.0% | |
| `conflict_reasonrape` | object | 0.0% | |
| `conflict_reasonlack_of_justice` | object | 0.0% | |
| `conflict_reasonother` | object | 0.0% | |
| `conflict_reasondont_know` | object | 0.0% | |
| `conflict_reasonrefused_to_answer` | object | 0.0% | |
| `conflict_reason_other` | object | 0.0% | |
| `witnessed_crimes` | int64 | 0.0% | |
| `how_safe` | int64 | 0.0% | |
| `safety_yearly_trend` | int64 | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |

---

## Numeric Summary

| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `region_name` | 1.0 | 1.0 | 1.0 | 1.0 |
| `district_name` | 1.0 | 1.0 | 1.0 | 1.0 |
| `village_name` | 1.0 | 4.0 | 2.4625 | 2.0 |
| `gender_responder` | 1.0 | 2.0 | 1.3938 | 1.0 |
| `age` | 2.0 | 6.0 | 3.4188 | 3.0 |
| `marital_status` | 1.0 | 4.0 | 2.1 | 2.0 |
| `level_education` | 1.0 | 7.0 | 4.45 | 5.0 |
| `police_presense` | 1.0 | 888.0 | 64.7438 | 1.0 |
| `number_of_stations` | 1.0 | 888.0 | 60.068 | 1.0 |
| `distance_to_station` | 1.0 | 888.0 | 102.068 | 1.0 |
| `reporting_civil` | 1.0 | 777.0 | 17.6125 | 4.0 |
| `reporting_petty_crime` | 1.0 | 777.0 | 18.125 | 4.0 |
| `reporting_serious_crime` | 1.0 | 777.0 | 16.4438 | 1.0 |
| `trusted_sec_prov` | 1.0 | 777.0 | 11.8562 | 2.0 |
| `reason_for_choice_sec` | 1.0 | 5.0 | 2.4367 | 2.0 |

---

## Curation

Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. 9 column(s) were cast from string to numeric or datetime based on parse-success rate (>85% threshold). The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.

---

## Limitations

- Data originates from Observatory of Conflict and Violence Prevention (inactive) and has not been independently validated by ESA.
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/adaado-district-conflict-and-security-assessment-2015) for the publisher's own methodology notes and caveats.

---

## Citation

```bibtex
@dataset{hdx_africa_adaado_district_conflict_and_security_assessment_2015,
  title  = {Adaado District Conflict and Security Assessment - 2015},
  author = {Observatory of Conflict and Violence Prevention (inactive)},
  year   = {2023},
  url    = {https://data.humdata.org/dataset/adaado-district-conflict-and-security-assessment-2015},
  note   = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```

---

*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica): Africa's ML dataset infrastructure. Lagos, Nigeria.*
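The Curation section of the card describes three mechanical steps: unifying missing-value markers, casting mostly-numeric string columns, and a seeded 80/20 split. The following is a minimal sketch of how such steps could look in pandas; it is not Electric Sheep Africa's actual pipeline. The function names, the exact-match (case-sensitive) treatment of markers, the numeric-only cast, and the toy `raw` frame are all assumptions made for illustration.

```python
import pandas as pd

# Missing-value markers listed in the card's Curation section.
MISSING_MARKERS = ["N/A", "null", "none", "-", "unknown", "no data", "#N/A"]


def unify_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Replace cells that exactly equal a listed marker with NaN.
    ASSUMPTION: matching is exact and case-sensitive."""
    return df.mask(df.isin(MISSING_MARKERS))


def cast_if_mostly_numeric(col: pd.Series, threshold: float = 0.85) -> pd.Series:
    """Cast a column to numeric when more than `threshold` of its
    non-missing values parse as numbers; otherwise return it unchanged.
    ASSUMPTION: only the numeric half of the card's numeric/datetime rule."""
    non_missing = col.dropna()
    if non_missing.empty:
        return col
    parsed = pd.to_numeric(non_missing, errors="coerce")
    if parsed.notna().mean() > threshold:
        return pd.to_numeric(col, errors="coerce")
    return col


def split_80_20(df: pd.DataFrame, seed: int = 42):
    """Seeded 80/20 split: sample 80% of rows, keep the rest as test."""
    train = df.sample(frac=0.8, random_state=seed)
    test = df.drop(train.index)
    return train, test


# Hypothetical raw frame with 160 rows, mirroring the dataset's row count.
raw = pd.DataFrame({
    "respondent": [str(i) for i in range(160)],
    "answer": ["1", "2", "N/A", "-"] * 40,
    "comment": ["", "unknown", "free text", "none"] * 40,
})

unified = unify_missing(raw)
clean = pd.DataFrame(
    {name: cast_if_mostly_numeric(unified[name]) for name in unified.columns}
)
train, test = split_80_20(clean)

print(clean.dtypes)
print(len(train), len(test))
```

With 160 rows, an 80% sample yields the same 128/32 partition sizes the card reports, although the specific rows chosen here have no relation to the published split.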
Provider:
electricsheepafrica
Collected Summary
Dataset Introduction
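Construction
Data collection in conflict and security research often faces complex geopolitical challenges. This dataset was built by the Observatory of Conflict and Violence Prevention (OCVP) through field surveys in Adaado District, which the dataset card describes as the capital of the Galgadud region in central Somalia. The team systematically collected community-level data on security perceptions, covering police presence, access to justice, and conflict incidents, among other dimensions. It contains 160 observations, and the card describes each row as a subnational administrative unit observation. The raw data was published on the Humanitarian Data Exchange (HDX) and then standardised by Electric Sheep Africa, which unified missing-value markers, adjusted column data types, and converted the result to Parquet for machine learning use.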
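Features
The dataset's 121 variables span geographic identifiers, demographics, security infrastructure, and community conflict dynamics; 37 are numeric and 84 are categorical. Categorical answers are stored as numeric codes (for example, 1 and 2 for respondent gender), and the special codes from the original survey are retained (for example, 777 for a particular kind of missing response). The spatial scope is a single district and the data is a 2015 cross-section, so the dataset offers depth rather than breadth: a rare empirical basis for micro-level research on security governance.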
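Because the coded columns keep values such as 777 and 888 alongside small category codes, summary statistics computed on the raw integers are dominated by those codes. The sketch below shows one way to mask them before analysis. It rests on an assumption: that 777 and 888 are non-substantive response codes (such as "don't know" or "refused"). Neither the card nor this page defines them, so check the publisher's codebook first. The toy values are invented; only the column names come from the schema.

```python
import pandas as pd

# ASSUMPTION: 777 and 888 are special, non-substantive survey codes.
# Their exact meaning is not documented on this page.
SENTINEL_CODES = [777, 888]


def mask_sentinels(df: pd.DataFrame, columns, codes=SENTINEL_CODES) -> pd.DataFrame:
    """Return a copy of `df` with sentinel codes in `columns` set to NaN."""
    out = df.copy()
    out[columns] = out[columns].mask(out[columns].isin(codes))
    return out


# Invented values; column names taken from the card's schema.
toy = pd.DataFrame({
    "police_presense": [1, 1, 2, 888, 1],
    "reporting_civil": [4, 777, 1, 4, 2],
    "gender_responder": [1, 2, 1, 1, 2],
})

cleaned = mask_sentinels(toy, ["police_presense", "reporting_civil"])

raw_mean = toy["police_presense"].mean()        # pulled up by the 888 code
masked_mean = cleaned["police_presense"].mean()  # substantive codes only
print(raw_mean, masked_mean)

# Coded answers are categories, not quantities, so a categorical dtype
# (and value counts rather than means) is usually the safer representation.
cleaned["gender_responder"] = cleaned["gender_responder"].astype("category")
print(cleaned["gender_responder"].value_counts())
```

Masking turns the integer columns into floats, since NaN has no integer representation in the default dtypes; pandas' nullable `Int64` dtype is an alternative if integer codes need to be preserved.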
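Usage
Researchers working on conflict analysis and public policy can obtain the dataset through the Hugging Face ecosystem. After loading it with the `datasets` library, convert it to a Pandas DataFrame for exploratory analysis. The pre-built 80/20 train/test split supports machine learning work such as classification tasks, for example predicting security conditions or assessing access to justice services. When analysing the data, pay attention to the special coding of numeric variables and consult the original methodology notes for the data-collection context. Because the data is already tabular and preprocessed, it suits both traditional statistical analysis and input to neural network models.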
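For classification on a dataset this small, a majority-class baseline is a useful first reference point before fitting any model. The sketch below computes one from a train/test pair of DataFrames. The toy frames stand in for `ds["train"].to_pandas()` and `ds["test"].to_pandas()`, and the choice of `how_safe` as the target is purely illustrative; the page does not designate a target column.

```python
import pandas as pd


def majority_baseline(train: pd.DataFrame, test: pd.DataFrame, target: str):
    """Predict the most frequent training label for every test row and
    return (label, accuracy on the test frame)."""
    majority = train[target].mode().iloc[0]
    accuracy = float((test[target] == majority).mean())
    return majority, accuracy


# Toy stand-ins for the real splits; values are invented.
train_df = pd.DataFrame({"how_safe": [1, 1, 2, 1, 3, 1, 2, 1]})
test_df = pd.DataFrame({"how_safe": [1, 2, 1, 1]})

label, acc = majority_baseline(train_df, test_df, "how_safe")
print(label, acc)
```

Any trained classifier should be compared against this number, since with imbalanced coded responses a model can reach high accuracy by predicting the dominant category alone.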
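Background and Challenges
Background
In conflict and security research, micro-level data on a specific area is essential for understanding community security dynamics. The Adaado District Conflict and Security Assessment - 2015 dataset was produced in 2015 by the Observatory of Conflict and Violence Prevention, which is now inactive, to assess issues directly affecting community security and safety in Adaado District, described in the dataset card as the capital of the Galgadud region in central Somalia. Its observations cover police presence, access to justice services, conflict incidents, and community issues, giving researchers an empirical basis for analysing how security governance and public health relate in fragile settings. Its release adds to the data available for conflict research in Africa and offers a reference for humanitarian intervention and policy-making.
Current Challenges
The dataset sits at the intersection of security assessment and public health in a conflict-affected area. The core challenge is extracting reliable patterns from limited, highly heterogeneous community data to support conflict prediction or the evaluation of interventions. Several difficulties follow from how it was built. The raw data was collected in an unstable environment and may contain reporting bias and inconsistent definitions. The sample is small and geographically narrow, which limits how well models generalise. Many variables are categorical, and special codes such as 888.0 and 777.0 complicate cleaning and interpretation. In addition, the publishing organisation is no longer active, so independent validation is difficult and methodological details are hard to obtain, which weakens confidence in the data and its reproducibility.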
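The small-sample limitation can be made concrete: with only 32 test rows, any accuracy figure carries wide uncertainty. The sketch below computes a 95% Wilson score interval for a proportion using only the standard library. The input of 24 correct predictions out of 32 is a hypothetical number chosen for illustration, not a result from this dataset.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion.
    z = 1.96 corresponds to an approximate 95% confidence level."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical outcome: 24 of the 32 test rows classified correctly.
low, high = wilson_interval(24, 32)
print(f"accuracy 0.75, 95% interval {low:.3f} to {high:.3f}")
```

For this hypothetical case the interval spans roughly 0.58 to 0.87, so two models whose test accuracies differ by a few points cannot be distinguished on the 32-row test split alone.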
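Common Scenarios
Typical Use Cases
The dataset provides structured data for community security assessment in Somalia's Adaado District. Researchers typically use its 121 variables, which cover geography, demographics, security perceptions, and use of justice services, to build classification models that predict community conflict risk or changes in security conditions. Analysing indicators such as residents' trust in the police, how often crimes are reported, and how accessible justice institutions are can reveal micro-level security dynamics and provide an empirical basis for research on regional stability.
Derived and Related Work
Work building on this dataset has concentrated on conflict prediction and governance assessment in Africa. Some researchers have used machine learning models built on it to map the risk of local violent incidents; others have combined security assessment data from several districts to compare how effective different informal justice mechanisms are at mediating conflict. These studies extend the use of computational social science in conflict analysis and offer a methodological reference for building cross-regional databases of security indicators.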
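Recent Research
Latest Research Directions
Community security assessment data from fragile settings such as Somalia is becoming a focus of current conflict and security research. By recording Adaado District residents' perceptions of policing, justice, and conflict, this dataset provides micro-level evidence on how informal governance mechanisms interact with security dynamics. Current work concentrates on applying machine learning to small-sample, high-dimensional data of this kind to predict community violence risk or evaluate interventions. In resource-constrained humanitarian response in particular, models built on such data can help with resource allocation and policy-making. The value lies in turning traditionally qualitative assessments into a quantifiable analytical framework, opening a data-driven path for research on conflict prevention and peacebuilding.
The content above was collected and summarized by 遇见数据集.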