electricsheepafrica/africa-idmc-event-data-for-ssd
Hugging Face · Updated 2026-04-10 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/africa-idmc-event-data-for-ssd
Resource description:
---
annotations_creators:
- no-annotation
language_creators:
- found
language:
- en
license: cc-by-4.0
multilinguality:
- monolingual
size_categories:
- n<1K
source_datasets:
- original
task_categories:
- tabular-classification
- tabular-regression
- other
task_ids: []
tags:
- africa
- humanitarian
- hdx
- electric-sheep-africa
- conflict-violence
- displacement
- flooding
- internally-displaced-persons-idp
- ssd
pretty_name: "South Sudan - Internal Displacements Updates (IDU) (event data)"
dataset_info:
splits:
- name: train
num_examples: 264
- name: test
num_examples: 66
---
# South Sudan - Internal Displacements Updates (IDU) (event data)
**Publisher:** Internal Displacement Monitoring Centre (IDMC) · **Source:** [HDX](https://data.humdata.org/dataset/idmc-event-data-for-ssd) · **License:** `cc-by-igo` · **Updated:** 2026-04-09
---
## Abstract
Conflict and disaster population movement (flows) data for South Sudan.
The **IDU (Internal Displacement Updates) dataset**, provided by the [Internal Displacement Monitoring Centre (IDMC)](https://www.internal-displacement.org/), offers timely event data and provisional information on new internal displacements caused by conflicts and disasters. Representing the most recent available information over a 180-day time period, the IDU is updated daily and focuses on "flows" (new displacements).
Internally displaced persons (IDPs) are defined according to the [1998 Guiding Principles](https://www.internal-displacement.org/internal-displacement/guiding-principles-on-internal-displacement/) as people or groups of people who have been forced or obliged to flee or to leave their homes or places of habitual residence, in particular as a result of armed conflict, or to avoid the effects of armed conflict, situations of generalized violence, violations of human rights, or natural or human-made disasters and who have not crossed an international border. The IDMC's event data, sourced from the IDU, provides initial assessments of these internal displacements, reflecting continually updated provisional information from various sources.
While the IDU offers early insights, the more thoroughly validated and curated "stock" (total number of people living in internal displacement) and "flow" (population movements) estimates are available in the annual [Global Internal Displacement Database (GIDD)](http://www.internal-displacement.org/database/displacement-data). Both datasets are accessible via API, with specific guidance on data access, structure, and limitations, including important preprocessing considerations for the IDU to ensure accurate analysis and avoid double-counting. For further detailed information and complete API specifications, users are encouraged to consult the official documentation at https://www.internal-displacement.org/database/api-documentation/.
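
The double-counting caveat can be made concrete with a minimal sketch. It assumes (this card does not confirm it) that rows whose `role` is `Triangulation` corroborate a figure already reported by a `Recommended figure` row, so only the latter are summed; check IDMC's API documentation before relying on this rule. The sample rows and the helper name `total_by_type` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows using this card's column names; the figures are invented.
rows = [
    {"event_id": 1, "role": "Recommended figure", "displacement_type": "Conflict", "figure": 1200},
    {"event_id": 1, "role": "Triangulation", "displacement_type": "Conflict", "figure": 1150},
    {"event_id": 2, "role": "Recommended figure", "displacement_type": "Disaster", "figure": 300},
]

def total_by_type(records, role="Recommended figure"):
    """Sum `figure` per `displacement_type`, keeping only rows with the given role."""
    totals = defaultdict(int)
    for rec in records:
        if rec["role"] == role:  # assumption: rows with other roles would double-count
            totals[rec["displacement_type"]] += rec["figure"]
    return dict(totals)

totals = total_by_type(rows)
print(totals)
```

With a pandas DataFrame `df` holding the same columns, an equivalent expression would be `df[df["role"] == "Recommended figure"].groupby("displacement_type")["figure"].sum()`.
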
The IDMC's Event data, sourced from the Internal Displacement Updates (IDU), offers initial assessments of internal displacements reported within the last 180 days. This dataset provides provisional information that is continually updated on a daily basis, reflecting the availability of data on new displacements arising from conflicts and disasters. The finalized, carefully curated, and validated estimates are then made accessible through [the Global Internal Displacement Database (GIDD)](https://www.internal-displacement.org/database/displacement-data). The IDU dataset comprises preliminary estimates aggregated from various publishers or sources.
Each row in this dataset represents a discrete event or incident. Temporal coverage is indicated by the `displacement_date` and `displacement_start_date` columns. Geographic scope: **SSD**.
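
As a rough illustration of the 180-day reporting window, the sketch below flags which dates fall within 180 days before a reference date. The helper name `within_window` and the sample dates are hypothetical, and treating the card's "Updated" date as the reference point is an assumption rather than documented IDMC behaviour.

```python
from datetime import date, timedelta

def within_window(displacement_date, reference_date, days=180):
    """True if displacement_date falls in the `days` ending at reference_date."""
    age = reference_date - displacement_date
    return timedelta(0) <= age <= timedelta(days=days)

reference = date(2026, 4, 9)  # assumption: the "Updated" date shown on this card
sample_dates = [date(2026, 3, 1), date(2025, 5, 31), date(2026, 4, 9)]  # hypothetical
flags = [within_window(d, reference) for d in sample_dates]
print(flags)
```

The same predicate can be applied to the `displacement_date` column once its values are converted to `datetime.date` objects.
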
*Curated into ML-ready Parquet format by [Electric Sheep Africa](https://huggingface.co/electricsheepafrica).*
---
## Dataset Characteristics
| | |
|---|---|
| **Domain** | Conflict and security |
| **Unit of observation** | Discrete events or incidents |
| **Rows (total)** | 330 |
| **Columns** | 29 (6 numeric, 17 categorical, 6 datetime, counting the timezone-aware `created_at`) |
| **Train split** | 264 rows |
| **Test split** | 66 rows |
| **Geographic scope** | SSD |
| **Publisher** | Internal Displacement Monitoring Centre (IDMC) |
| **HDX last updated** | 2026-04-09 |
---
## Variables
**Geographic** — `country` (South Sudan), `iso3` (SSD), `latitude` (range 4.1448–11.8341), `longitude` (range 27.3976–34.166), `displacement_type` (Conflict, Disaster) and 10 others.
**Temporal** — `event_start_date`, `event_end_date`.
**Identifier / Metadata** — `id` (range 220265.0–244247.0), `centroid` ([7.9779339, 31.95917545], [8.641529299999998, 32.14004335], [8.67997095, 32.205881500000004]), `event_id` (range 31206.0–40877.0), `event_name` (South Sudan: Non-International armed conflict (NIAC) - Countrywide - 2026, South Sudan: Non-International armed conflict (NIAC) - Countrywide - 2025, South Sudan: Flood - Unity - 31/05/2025), `sources` (IOM DTM South Sudan, Office for the Coordination of Humanitarian Affairs (OCHA), Office for the Coordination of Humanitarian Affairs (OCHA); The South Sudan Relief and Rehabilitation Commission (SSRRC)) and 2 others.
**Other** — `role` (Recommended figure, Triangulation), `qualifier` (total, more than or equal to, approximately), `figure` (range 66.0–180000.0), `created_at`, `description`.
---
## Quick Start
```python
from datasets import load_dataset
ds = load_dataset("electricsheepafrica/africa-idmc-event-data-for-ssd")
train = ds["train"].to_pandas()
test = ds["test"].to_pandas()
print(train.shape)
train.head()
```
---
## Schema
| Column | Type | Null % | Range / Sample Values |
|---|---|---|---|
| `id` | int64 | 0.0% | 220265 – 244247 (mean 238712.9303) |
| `country` | object | 0.0% | South Sudan |
| `iso3` | object | 0.0% | SSD |
| `latitude` | float64 | 0.0% | 4.1448 – 11.8341 (mean 7.6734) |
| `longitude` | float64 | 0.0% | 27.3976 – 34.166 (mean 31.5974) |
| `centroid` | object | 0.0% | [7.9779339, 31.95917545], [8.641529299999998, 32.14004335], [8.67997095, 32.205881500000004] |
| `role` | object | 0.0% | Recommended figure, Triangulation |
| `displacement_type` | object | 0.0% | Conflict, Disaster |
| `qualifier` | object | 0.0% | total, more than or equal to, approximately |
| `figure` | int64 | 0.0% | 66 – 180000 (mean 5668.5788) |
| `displacement_date` | datetime64[ns] | 0.0% | |
| `displacement_start_date` | datetime64[ns] | 0.0% | |
| `displacement_end_date` | datetime64[ns] | 0.0% | |
| `year` | int64 | 0.0% | 2025 – 2026 (mean 2025.7061) |
| `event_id` | int64 | 0.0% | 31206 – 40877 (mean 37834.7212) |
| `event_name` | object | 0.0% | South Sudan: Non-International armed conflict (NIAC) - Countrywide - 2026, South Sudan: Non-International armed conflict (NIAC) - Countrywide - 2025, South Sudan: Flood - Unity - 31/05/2025 |
| `event_start_date` | datetime64[ns] | 0.0% | |
| `event_end_date` | datetime64[ns] | 0.0% | |
| `sources` | object | 0.0% | IOM DTM South Sudan, Office for the Coordination of Humanitarian Affairs (OCHA), Office for the Coordination of Humanitarian Affairs (OCHA); The South Sudan Relief and Rehabilitation Commission (SSRRC) |
| `locations_name` | object | 0.0% | Pieri, Uror, Jonglei, South Sudan; Uror, Uror, Jonglei, South Sudan, Chuil, Nyirol, Jonglei, South Sudan; Nyambor, Nyirol, Jonglei, South Sudan, Chuil, Nyirol, Jonglei, South Sudan; Thol, Nyirol, Jonglei, South Sudan |
| `locations_coordinates` | object | 0.0% | 8.0450345, 32.0293535; 7.910833299999999, 31.8889974, 8.849584799999999, 32.3341201; 8.4334738, 31.9459666, 8.849584799999999, 32.3341201; 8.5103571, 32.0776429 |
| `locations_accuracy` | object | 0.0% | |
| `locations_type` | object | 0.0% | |
| `displacement_occurred` | object | 0.0% | |
| `created_at` | datetime64[ns, UTC] | 0.0% | |
| `description` | object | 0.0% | |
| `combined_type` | object | 0.0% | |
| `esa_source` | object | 0.0% | |
| `esa_processed` | object | 0.0% | |
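
The `centroid` and `locations_coordinates` columns are typed `object`. Assuming they are stored as strings in the formats shown in the sample values (a bracketed `[lat, lon]` pair, and `;`-separated `lat, lon` pairs), a minimal parsing sketch looks like this. The helper names are hypothetical, and the latitude-first ordering is inferred from the `latitude` and `longitude` ranges in the table.

```python
import json

def parse_centroid(text):
    """Parse a string such as "[7.9779339, 31.95917545]" into (lat, lon)."""
    lat, lon = json.loads(text)
    return float(lat), float(lon)

def parse_locations(text):
    """Parse "lat, lon; lat, lon" into a list of (lat, lon) tuples."""
    pairs = []
    for chunk in text.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing separator
        lat, lon = (float(part) for part in chunk.split(","))
        pairs.append((lat, lon))
    return pairs

centroid = parse_centroid("[7.9779339, 31.95917545]")
locations = parse_locations("8.0450345, 32.0293535; 7.910833299999999, 31.8889974")
print(centroid, locations)
```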
---
## Numeric Summary
| Column | Min | Max | Mean | Median |
|---|---|---|---|---|
| `id` | 220265.0 | 244247.0 | 238712.9303 | 244082.5 |
| `latitude` | 4.1448 | 11.8341 | 7.6734 | 7.9793 |
| `longitude` | 27.3976 | 34.166 | 31.5974 | 31.864 |
| `figure` | 66.0 | 180000.0 | 5668.5788 | 1260.0 |
| `year` | 2025.0 | 2026.0 | 2025.7061 | 2026.0 |
| `event_id` | 31206.0 | 40877.0 | 37834.7212 | 39313.0 |
---
## Curation
Raw data was downloaded from HDX via the CKAN API and converted to Parquet. Column names were lowercased and standardised to snake_case. Common missing-value markers (`N/A`, `null`, `none`, `-`, `unknown`, `no data`, `#N/A`) were unified to `NaN`. Nine columns with more than 80% missing values were removed, including `event_codes`, `event_code_types`, `category`, `subcategory`, `type`, and `subtype`; the remaining three are not named here. Six columns were cast from string to numeric or datetime where the parse-success rate exceeded 85%. The dataset was split 80/20 into train and test partitions using a fixed random seed (42) and saved as Snappy-compressed Parquet.
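
The cleaning steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the actual curation pipeline: the function names, the snake_case regex, and the case-insensitive marker matching are all hypothetical, `None` stands in for `NaN`, and only numeric casting (not datetime) is shown.

```python
import random
import re

# Markers listed in this card, lowercased (case-insensitive matching is an assumption).
MISSING_MARKERS = {"n/a", "null", "none", "-", "unknown", "no data", "#n/a"}

def to_snake_case(name):
    """Lowercase a column name and join its alphanumeric runs with underscores."""
    return "_".join(re.findall(r"[a-z0-9]+", name.lower()))

def clean_value(value):
    """Map missing-value markers to None (standing in for NaN)."""
    if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
        return None
    return value

def missing_fraction(values):
    """Share of entries that are None; columns above 0.80 would be dropped."""
    return sum(v is None for v in values) / len(values)

def numeric_parse_rate(values):
    """Share of non-missing entries that parse as float; cast if above 0.85."""
    present = [v for v in values if v is not None]
    parsed = 0
    for v in present:
        try:
            float(v)
            parsed += 1
        except (TypeError, ValueError):
            pass
    return parsed / len(present) if present else 0.0

def split_80_20(rows, seed=42):
    """Shuffle a copy of rows with a fixed seed and cut at 80%."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

column = [clean_value(v) for v in ["12", "N/A", "7.5", "unknown", "x"]]
train_rows, test_rows = split_80_20(range(330))
print(to_snake_case("Displacement Start-Date"), column, len(train_rows), len(test_rows))
```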
---
## Limitations
- Data originates from the Internal Displacement Monitoring Centre (IDMC) and has not been independently validated by Electric Sheep Africa (ESA).
- Automated cleaning cannot correct for misreported values, definitional inconsistencies, or sampling bias in the original collection.
- Refer to the [original HDX dataset page](https://data.humdata.org/dataset/idmc-event-data-for-ssd) for the publisher's own methodology notes and caveats.
---
## Citation
```bibtex
@dataset{hdx_africa_idmc_event_data_for_ssd,
title = {South Sudan - Internal Displacements Updates (IDU) (event data)},
author = {Internal Displacement Monitoring Centre (IDMC)},
year = {2026},
url = {https://data.humdata.org/dataset/idmc-event-data-for-ssd},
note = {Repackaged for machine learning by Electric Sheep Africa (https://huggingface.co/electricsheepafrica)}
}
```
---
*[Electric Sheep Africa](https://huggingface.co/electricsheepafrica) — Africa's ML dataset infrastructure. Lagos, Nigeria.*
Provider:
electricsheepafrica



