Accurate ethnicity prediction from placental DNA methylation data
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128827
Description:
Background: The influence of genetics on DNA methylation (DNAme) variation is well documented, yet confounding from population stratification is often unaccounted for in DNAme association studies. Existing approaches address confounding by population stratification directly from DNAme data, but they have not been validated in additional human populations or tissues, such as the placenta.

Results: To aid future placental DNAme studies in assessing population stratification, we developed an ethnicity classifier, PLANET (placental elastic net DNAme ethnicity classifier), on combined Infinium Human Methylation 450k BeadChip array (HM450k) data from placental samples. We used data from five North American cohorts from private and public repositories (n = 509) and show that PLANET can not only accurately predict (accuracy = 0.9379, kappa = 0.8227) major classes of self-reported ethnicity/race (African: n = 58, Asian: n = 53, Caucasian: n = 389), but can also produce probabilities that are highly correlated with genetic ancestry inferred from genome-wide SNP (>2.5 million SNPs) and ancestry informative marker (n = 50) data. We found that PLANET’s ethnicity classification relies on 1860 DNAme microarray sites, and over half of these (n = 955) were also linked to nearby genetic polymorphisms. Lastly, we found that our placental-optimized method outperforms existing approaches in assessing population stratification in our placental samples from individuals of Asian, African, and Caucasian ethnicities.

Conclusion: PLANET outperforms existing methods and relies heavily on the genetic signal present in DNAme microarray data. PLANET can be used to address population stratification in future placental DNAme association studies, and will be especially useful when ethnicity information is missing and genotyping markers are unavailable.

508 samples were used to develop and validate PLANET, our placental ethnicity classifier. Because they come from multiple studies, these samples are mixed in phenotype (healthy, preeclamptic, IUGR, and a range of third-trimester gestational ages and birthweights). The samples are mainly of Caucasian, Asian, and African ethnicity and are matched for sex (259 males and 249 females). This repository contains processed DNAme data for all 508 samples in the form of a beta value matrix, with a corresponding detection p-value matrix and sample-specific metadata (e.g. ethnicity, sex). The majority of the 508 samples were collected from other GEO repositories; therefore, raw IDAT files are included in this repository only for previously unpublished samples. This amounts to 6 samples (12 IDAT files). Please see the samples metadata .xls file for a list of these samples.
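The abstract reports classifier performance as accuracy and kappa. As a rough illustration of how those two metrics relate, the sketch below computes overall accuracy and Cohen's kappa (agreement corrected for chance) from paired true and predicted labels. The function name `accuracy_and_kappa` and the ten toy labels are hypothetical and chosen only for illustration; they are not taken from the dataset, and this is not the authors' evaluation code.

```python
from collections import Counter


def accuracy_and_kappa(true_labels, predicted_labels):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(true_labels)

    # Observed agreement: fraction of samples where prediction matches truth.
    observed = sum(t == p for t, p in zip(true_labels, predicted_labels)) / n

    # Expected chance agreement: for each class, the product of its marginal
    # frequency among the true labels and among the predicted labels.
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class is present")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Toy labels (hypothetical, NOT from GSE128827): one African sample is
# misclassified as Caucasian.
toy_true = ["African"] * 2 + ["Asian"] * 2 + ["Caucasian"] * 6
toy_pred = ["African", "Caucasian"] + ["Asian"] * 2 + ["Caucasian"] * 6

toy_accuracy, toy_kappa = accuracy_and_kappa(toy_true, toy_pred)
print(f"accuracy = {toy_accuracy:.4f}, kappa = {toy_kappa:.4f}")
```

Because the classes here are imbalanced, as they are in the cohort described above, kappa comes out lower than raw accuracy: a classifier that always predicted the majority class would still score a high accuracy but a kappa of zero.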
Created:
2020-10-24



