Synthetic Datasets for Scenario-based Data Breaches
Source: doi.org (indexed 2025-03-24)
Download link:
http://doi.org/10.17632/sxfjgcynjv.1
Description:
We have generated synthetic datasets for scenario-based data breaches. There are two kinds of datasets: 1. A Master Record Table (MRT) of 4 million records, each profiling an individual with several items of personally identifiable information (PII), all generated programmatically; 2. 16 scenario-based datasets depicting various fictitious data breaches, in which the number of records and the PIIs included vary from one dataset to the next. We have also included the code so that practitioners, researchers, and others can reuse it. This supports transparency in the form of reusability, reproducibility, and replicability.
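The two-part layout described above can be illustrated with a small sketch. This is not the authors' generation code, which ships with the dataset. The column names, the value formats, and the idea that each scenario is a random subset of MRT records and PII columns are all assumptions made for illustration; the description does not state how the scenario datasets relate to the MRT.

```python
import random

# Hypothetical PII columns; the real dataset's fields are not listed in the description.
PII_FIELDS = ["name", "email", "phone", "dob", "address"]


def make_mrt(n, seed=0):
    """Build a toy master record table of n fictitious individuals."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        rows.append({
            "id": i,
            "name": f"Person {i}",
            "email": f"person{i}@example.com",
            "phone": f"555-{rng.randrange(10_000):04d}",
            "dob": f"{rng.randrange(1950, 2005)}-{rng.randrange(1, 13):02d}-{rng.randrange(1, 29):02d}",
            "address": f"{rng.randrange(1, 1000)} Example Street",
        })
    return rows


def make_scenario(mrt, n_records, fields, seed=0):
    """Assumed breach model: leak n_records rows, exposing only the given PII fields."""
    rng = random.Random(seed)
    leaked = rng.sample(mrt, n_records)
    return [{"id": row["id"], **{f: row[f] for f in fields}} for row in leaked]


def make_scenarios(mrt, n_scenarios, seed=0):
    """Create scenarios whose record counts and exposed PII fields both vary."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n_scenarios):
        n_records = rng.randint(1, len(mrt))
        fields = rng.sample(PII_FIELDS, rng.randint(1, len(PII_FIELDS)))
        scenarios.append(make_scenario(mrt, n_records, fields, seed=rng.randrange(2**32)))
    return scenarios


# Scaled down from 4 million records so the sketch runs instantly.
mrt = make_mrt(1000)
scenarios = make_scenarios(mrt, 16)
```

Fixing the seeds makes every run produce the same tables, which is one simple way to get the reproducibility the description emphasizes.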
Provider:
doi.org



