
SynGen6: Synthetic Genomic Dataset with Diverse Ancestry

Source: IEEE (indexed 2026-04-17)
Download link:
https://ieee-dataport.org/documents/syngen6-synthetic-genomic-dataset-diverse-ancestry
Description:
SynGen6 is a synthetic genomic dataset that encompasses six distinct populations. We used Principal Component Analysis (PCA) and ϵ-local differential privacy (LDP) to generate synthetic samples. We then simulated phenotype vectors associated with significant SNPs, mirroring real-world gene-disease associations. We also generated synthetic SNPs to watermark the dataset, enabling verification of outsourced computations. Lastly, synthetic relatives were created to support research on kinship inference and family-based genomic analyses. The actual SynGen6 data can be created by running our scripts in the All of Us Research Hub WorkBench. Here, we provide a toy example based on the public 1000 Genomes dataset.
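The description does not spell out how PCA and ϵ-LDP are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' pipeline. It assumes that the PCA basis is fit on a separate public reference panel, that each individual's PC scores are clipped to a fixed bound, and that the Laplace mechanism supplies the ϵ-LDP noise. The function names (`fit_pca_basis`, `ldp_pca_synthesize`), the parameters `k` and `bound`, and the random toy genotypes are all hypothetical.

```python
import numpy as np

def fit_pca_basis(reference, k):
    """Fit a k-component PCA basis on a (assumed public) reference panel."""
    mean = reference.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:k].T  # basis has shape (n_snps, k)

def ldp_pca_synthesize(genotypes, basis, mean, epsilon, bound, rng):
    """Perturb each individual's clipped PC scores with Laplace noise,
    then map the noisy scores back to 0/1/2 genotype calls."""
    k = basis.shape[1]
    scores = np.clip((genotypes - mean) @ basis, -bound, bound)
    # Two clipped k-dimensional score vectors differ by at most 2*bound*k
    # in L1 norm, so this Laplace scale gives epsilon-LDP per individual.
    scale = 2.0 * bound * k / epsilon
    noisy = scores + rng.laplace(0.0, scale, size=scores.shape)
    # Reconstruction and rounding are post-processing of the noisy scores.
    recon = noisy @ basis.T + mean
    return np.clip(np.rint(recon), 0, 2).astype(np.int8)

# Toy data: random genotype matrices, NOT real genomes (illustration only).
rng = np.random.default_rng(0)
reference = rng.integers(0, 3, size=(200, 50))  # hypothetical public panel
private = rng.integers(0, 3, size=(20, 50))     # hypothetical private cohort

mean, basis = fit_pca_basis(reference, k=5)
synthetic = ldp_pca_synthesize(private, basis, mean,
                               epsilon=1.0, bound=3.0, rng=rng)
```

Under these assumptions the privacy guarantee covers only the perturbed scores; fitting the basis on the private data itself would need its own privacy accounting.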
Provided by:
Vaidya, Jaideep; Wang, Xinyue; Min, Sitao