
HAPNEST synthetic dataset

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://www.omicsdi.org/dataset/biostudies-other/S-BSST936
Description:
This synthetic dataset contains genetic data for 1,008,000 individuals and 9 continuous phenotypic traits with various genetic architectures. It covers 6 ancestry groups (AFR, AMR, CSA, EAS, EUR, MID) and over 6.8 million single nucleotide polymorphisms (SNPs) across 22 chromosomes.

The data was generated with HAPNEST (https://github.com/intervene-EU-H2020/synthetic_data), a software program developed by members of the INTERVENE consortium (https://www.interveneproject.eu/). HAPNEST is designed for efficient, large-scale generation of synthetic data for common genetic variants and complex phenotypic traits. The software is open source, so anyone can generate their own synthetic datasets. See the linked GitHub repository for further details.

The reference dataset used to generate this synthetic dataset is the combined 1000 Genomes Project and Human Genome Diversity Project data, downloaded from https://gnomad.broadinstitute.org/downloads. It was preprocessed by retaining only SNPs that have a non-zero minor allele frequency (MAF) in all populations and whose rsID numbers could be successfully aligned, which yielded the 6.8 million-plus variants across 22 chromosomes.
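The preprocessing filter described above can be illustrated with a minimal sketch. This is not the HAPNEST or INTERVENE pipeline. The record layout, the `keep_variant` helper, the placeholder rsIDs, and the frequencies are all hypothetical, and the sketch assumes biallelic sites with one alternate-allele frequency per ancestry group.

```python
# Illustrative only: hypothetical records, not real gnomAD or HAPNEST data.
POPULATIONS = ("AFR", "AMR", "CSA", "EAS", "EUR", "MID")


def maf(alt_freq):
    """Minor allele frequency of a biallelic site given its alternate-allele frequency."""
    return min(alt_freq, 1.0 - alt_freq)


def keep_variant(variant):
    """Keep a variant only if it has an rsID and a non-zero MAF in every population."""
    if not variant.get("rsid"):
        return False
    freqs = variant["alt_freq"]
    return all(maf(freqs[pop]) > 0.0 for pop in POPULATIONS)


variants = [
    # Polymorphic in every population: kept.
    {"rsid": "rs_example_1", "alt_freq": {pop: 0.2 for pop in POPULATIONS}},
    # Alternate allele absent in EAS (MAF = 0): dropped.
    {"rsid": "rs_example_2",
     "alt_freq": {**{pop: 0.1 for pop in POPULATIONS}, "EAS": 0.0}},
    # No rsID could be assigned: dropped.
    {"rsid": None, "alt_freq": {pop: 0.3 for pop in POPULATIONS}},
    # Alternate allele fixed in MID (MAF = 0): dropped.
    {"rsid": "rs_example_4",
     "alt_freq": {**{pop: 0.5 for pop in POPULATIONS}, "MID": 1.0}},
]

kept = [v["rsid"] for v in variants if keep_variant(v)]
print(kept)
```

Only the first record passes both conditions. In practice a filter like this would run over per-population frequencies taken from the reference panel, not over hand-written dictionaries.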
Created:
2023-03-22