
electricsheepafrica/african-data-breach-registry

Source: Hugging Face. Updated 2026-03-21; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/electricsheepafrica/african-data-breach-registry
Description:
---
license: cc-by-4.0
tags:
- cybersecurity
- data-breach
- privacy
- sub-saharan-africa
- synthetic
- regulatory
language:
- en
pretty_name: African Data Breach Registry
size_categories:
- 10K<n<100K
task_categories:
- tabular-regression
- classification
---

# African Data Breach Registry

Synthetic dataset of data breach incidents across 12 Sub-Saharan African countries, modeled on real regulatory frameworks including Nigeria's NDPPA, Kenya's ODPC, and South Africa's POPIA.

## Dataset Description

- **3 scenarios**: `baseline`, `enforcement_improved`, `breach_surge`
- **10,000 records per scenario** (30,000 total)
- **12 countries**: Nigeria, Kenya, South Africa, Ghana, Tanzania, Uganda, Rwanda, Ethiopia, Senegal, Côte d'Ivoire, Zambia, Mozambique
- **Time period**: 2021–2025

## Variables

| Variable | Type | Description |
|---|---|---|
| record_id | int | Unique record identifier |
| country | str | Country name |
| dpa | str | Data Protection Authority |
| year | int | Breach year |
| month | int | Breach month |
| breach_date | date | Date breach occurred (YYYY-MM-DD) |
| discovery_date | date | Date breach was discovered |
| notification_date | date | Date breach was reported |
| sector | str | Affected sector (financial/healthcare/telecom/government/retail/education) |
| breach_type | str | Attack vector (hacking/insider/malware/phishing/physical/accidental_disclosure) |
| records_exposed | int | Number of records compromised |
| records_with_pii | int | Records containing PII |
| data_types | str | Pipe-separated data types (names/emails/phones/financial/health/biometric) |
| notification_delay_days | int | Days from breach to notification |
| regulatory_response | str | Regulatory action (investigation/fine/none) |
| remediation_cost_usd | float | Estimated remediation cost (USD) |
| public_disclosure_flag | int | Whether breach was publicly disclosed (0/1) |
| dpa_notification_flag | int | Whether DPA was notified (0/1) |
| breach_severity | str | Severity class (minor/moderate/major/critical) |
| affected_individuals_thousands | float | Affected individuals in thousands |

## Scenarios

- **baseline**: Current regulatory enforcement levels
- **enforcement_improved**: Strengthened DPA capacity, faster notification, higher fines
- **breach_surge**: Increased attack volume with weakened enforcement

## Regulatory Context

The dataset models enforcement dynamics of:

- **Nigeria**: Nigeria Data Protection and Privacy Act (NDPPA) / NDPR
- **Kenya**: Data Protection Act 2019 / Office of the Data Protection Commissioner (ODPC)
- **South Africa**: Protection of Personal Information Act (POPIA) / Information Regulator
- Other countries with emerging or absent data protection frameworks

## Usage

```python
import pandas as pd

baseline = pd.read_csv("data/baseline.csv")
enforcement = pd.read_csv("data/enforcement_improved.csv")
surge = pd.read_csv("data/breach_surge.csv")
```

## Generate & Validate

```bash
pip install -r requirements.txt
python generate_dataset.py
python validate_dataset.py
```

## License

CC-BY-4.0
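## Example: Comparing Scenarios

The Usage snippet loads each scenario into its own frame. As a minimal sketch of how the three files might be analyzed together, the helpers below stack them under a `scenario` label, split the pipe-separated `data_types` column, and recompute the notification delay from the date columns. The helper names (`load_scenarios`, `split_data_types`, `recompute_delay`, `median_delay_by_scenario`) are illustrative and not part of the repository. The sketch assumes the `data/` layout shown in Usage and that the date columns parse as `YYYY-MM-DD`.

```python
import pandas as pd

# Scenario names and date columns as listed in the dataset card.
SCENARIOS = ("baseline", "enforcement_improved", "breach_surge")
DATE_COLUMNS = ["breach_date", "discovery_date", "notification_date"]


def load_scenarios(data_dir: str = "data") -> pd.DataFrame:
    """Read the three scenario CSVs and stack them with a `scenario` label.

    Assumes files named <data_dir>/<scenario>.csv, as in the Usage section.
    """
    frames = [
        pd.read_csv(f"{data_dir}/{name}.csv", parse_dates=DATE_COLUMNS)
        .assign(scenario=name)
        for name in SCENARIOS
    ]
    return pd.concat(frames, ignore_index=True)


def split_data_types(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (record, data type) from the pipe-separated column."""
    out = df.assign(data_type=df["data_types"].str.split("|"))
    return out.explode("data_type").reset_index(drop=True)


def recompute_delay(df: pd.DataFrame) -> pd.Series:
    """Days from breach to notification, derived from the two date columns."""
    return (df["notification_date"] - df["breach_date"]).dt.days


def median_delay_by_scenario(df: pd.DataFrame) -> pd.Series:
    """Median of `notification_delay_days` within each scenario."""
    return df.groupby("scenario")["notification_delay_days"].median()
```

With the CSVs in place, `median_delay_by_scenario(load_scenarios())` gives one median per scenario, and comparing `recompute_delay(df)` against `notification_delay_days` is a quick consistency check on the stored delay. Whether the two always agree in the generated data is not stated in the card.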
Provider:
electricsheepafrica