ProHealth: Kaiser Permanente GWAS of Prostate Cancer
Source: NIAID Data Ecosystem (indexed 2026-04-25)
Download link:
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001221.v1.p1
Resource description:
A genome-wide association study (GWAS) of prostate cancer (PCa) was conducted in Kaiser Permanente (KP) Northern California health plan members (7,783 cases, 38,595 controls; 80.3% non-Hispanic white, 4.9% African-American, 7.0% East Asian, and 7.8% Latino) [PMID: 26034056]. The data for these members were drawn from three KP cohort studies: the Research Program on Genes, Environment, and Health (RPGEH), ProHealth, and the California Men's Health Study (CMHS) (described further under Study History). Four custom arrays were designed for genotyping, one for each of the four major race-ethnicity groups in the RPGEH cohort: African Americans, East Asians, Latinos, and Non-Hispanic Whites. The number of SNPs and SNP content varied by array, with SNP content designed to maximize the genome-wide coverage of low-frequency and more common variants specific to the different race-ethnicity groups, including newly identified SNPs from sequencing projects, and SNPs with established associations with disease phenotypes and risk factors [PMIDs: 21565264, 21903159]. Within the total study cohort, n=34,736 completed a consent that permitted deposition of data to NIH. Genotyping followed the same general procedure described in [PMID: 26092718], plus additional quality control (QC) steps for the additional men, in order to control for potential batch and kit effects, described in [PMID: 26034056]. Briefly, we first repeated the filters described in [PMID: 26092718] for all four arrays (EUR, LAT, EAS, AFR). Then, on an array-wise basis, we removed SNPs with minor allele frequency (MAF) <0.01, with a call rate <95%, or with a Hardy-Weinberg Equilibrium (HWE) p-value in homogeneous groups <1x10^-5. Furthermore, on the EUR array, to adjust for potential kit effects, we conducted a GWAS of kit and removed kit-associated SNPs with p<1x10^-6; we also re-genotyped each of the new samples (those not genotyped with the original Genetic Epidemiology Research on Aging (GERA) data) with some of the original GERA data, and removed SNPs with >13/1,268 (1%) mismatches. 
For the AFR array, to adjust for potential plate batch issues, we conducted a GWAS of whether an individual was in the original GERA data vs. in the newly genotyped data and removed batch-associated SNPs with p<0.05 (we used a more stringent threshold than that used for the EUR array because there were fewer individuals on the AFR array); we also re-genotyped each of the new samples with the original GERA data and removed SNPs with >2/78 (2.6%) mismatches. After the QC described above, imputation was performed as described in [PMID: 26034056]. Imputation was performed on an array-wise basis, pre-phasing with SHAPE-IT v2.5 [PMID: 22138821], and imputing from the 1000 Genomes Project October 2014 release as a cosmopolitan reference panel with IMPUTE2 [PMID: 22384356]. In addition to the GWAS described above, a nested exome-wide association study (EWAS) of PCa was also conducted (7,489 cases, 7,323 controls; 78% non-Hispanic white, 9% African-American, 3% East Asian, 6% Latino, 4% Other). A custom EWAS array, primarily focused on rare variants, was designed for genotyping to complement the GWAS arrays [PMID: 26034056]. The EWAS array content included missense and loss-of-function mutations, and rare exonic mutations from The Cancer Genome Atlas (TCGA) and dbGaP prostate cancer tumor exomes [PMID: 26544944]. Much of the EWAS array design content overlapped with the probesets on the UK Biobank Affymetrix Axiom array [PMID: 30305743]. Genotyping and the QC steps taken to filter out low-quality samples and variants with low call rates are described in Emami et al., 2020 [bioRxiv]. The resulting EWAS array genotypes are provided here.
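As a rough illustration of the array-wise SNP filters described above (MAF <0.01, call rate <95%, HWE p-value <1x10^-5), the following Python sketch applies the three thresholds to per-SNP genotype counts. It is not the study's pipeline: the `GenotypeCounts` container and all function names are hypothetical, and the HWE p-value is computed here with a one-degree-of-freedom chi-square test, which is an assumption because the description does not state which HWE test was used.

```python
import math
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    """Per-SNP genotype counts in one homogeneous group (hypothetical container)."""
    n_aa: int       # samples homozygous for allele A
    n_ab: int       # heterozygous samples
    n_bb: int       # samples homozygous for allele B
    n_missing: int  # samples with no genotype call


def call_rate(c: GenotypeCounts) -> float:
    """Fraction of samples with a genotype call at this SNP."""
    total = c.n_aa + c.n_ab + c.n_bb + c.n_missing
    return (total - c.n_missing) / total if total else 0.0


def minor_allele_frequency(c: GenotypeCounts) -> float:
    """Frequency of the less common allele among called genotypes."""
    called = c.n_aa + c.n_ab + c.n_bb
    freq_a = (2 * c.n_aa + c.n_ab) / (2 * called)
    return min(freq_a, 1.0 - freq_a)


def hwe_chi2_p_value(c: GenotypeCounts) -> float:
    """HWE p-value from a 1-df chi-square test (assumed test, see lead-in)."""
    called = c.n_aa + c.n_ab + c.n_bb
    freq_a = (2 * c.n_aa + c.n_ab) / (2 * called)
    freq_b = 1.0 - freq_a
    expected = (called * freq_a ** 2,
                2 * called * freq_a * freq_b,
                called * freq_b ** 2)
    observed = (c.n_aa, c.n_ab, c.n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no measurable departure from HWE
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(c: GenotypeCounts,
                  min_maf: float = 0.01,
                  min_call_rate: float = 0.95,
                  min_hwe_p: float = 1e-5) -> bool:
    """Keep a SNP only if it clears all three thresholds quoted in the text."""
    if c.n_aa + c.n_ab + c.n_bb == 0:
        return False  # nothing was called, so the SNP cannot be assessed
    return (call_rate(c) >= min_call_rate
            and minor_allele_frequency(c) >= min_maf
            and hwe_chi2_p_value(c) >= min_hwe_p)


# A common SNP in perfect HWE with no missing calls is retained.
print(passes_snp_qc(GenotypeCounts(250, 500, 250, 0)))  # True
```

The kit and batch association filters and the re-genotyping mismatch filters are not covered by this sketch, since they depend on per-sample kit labels and duplicate genotypes.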
Inclusion criteria for the data deposited in dbGaP include all of the following: Males included in the KP RPGEH, ProHealth, or CMHS cohorts. Successfully genotyped from extracted DNA. For GWAS, DQC ≥ 0.82; call rate ≥ 0.97 (≥0.95 for African Americans). For EWAS, DQC ≥ 0.75; call rate ≥ 0.95. Provided explicit consent to have data deposited in an NIH-maintained database. Exclusion criteria for the data deposited in dbGaP include any of the following: Subject requested withdrawal from the study after DNA extraction and genotyping. Validity of the link between biospecimen and study participant is questionable because of genotype-phenotype discordance, e.g. gender.
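The sample-level criteria above can be read as a simple predicate. The sketch below encodes the DQC and call-rate thresholds and the consent, withdrawal, and discordance conditions exactly as listed; the function names, parameter names, and the `african_american` flag are hypothetical and only illustrate how the rules combine.

```python
def meets_genotyping_thresholds(study: str,
                                dqc: float,
                                call_rate: float,
                                african_american: bool = False) -> bool:
    """Check the DQC and call-rate thresholds listed for GWAS or EWAS samples."""
    if study == "GWAS":
        min_dqc = 0.82
        # The text gives a lower call-rate threshold for African Americans.
        min_call_rate = 0.95 if african_american else 0.97
    elif study == "EWAS":
        min_dqc = 0.75
        min_call_rate = 0.95
    else:
        raise ValueError(f"unknown study type: {study!r}")
    return dqc >= min_dqc and call_rate >= min_call_rate


def eligible_for_deposit(study: str,
                         dqc: float,
                         call_rate: float,
                         consented: bool,
                         withdrew: bool = False,
                         discordant: bool = False,
                         african_american: bool = False) -> bool:
    """All inclusion criteria must hold and no exclusion criterion may apply."""
    included = consented and meets_genotyping_thresholds(
        study, dqc, call_rate, african_american)
    excluded = withdrew or discordant
    return included and not excluded


# A consented GWAS sample exactly at both thresholds is eligible.
print(eligible_for_deposit("GWAS", dqc=0.82, call_rate=0.97, consented=True))  # True
```

Cohort membership and sex are assumed to have been checked upstream and are not modeled here.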
Participants from three cohorts are included in this study and data. The Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) is a resource developed to facilitate research on the influence of genetic and environmental factors on a wide variety of common diseases and healthy aging. RPGEH links together data from the electronic medical records (EMR) of participants, survey data on demographic and behavioral factors, and environmental data from geographic information system databases, with genetic data derived from biospecimens from participating Health Plan members. The RPGEH study samples included men in the Genetic Epidemiology Research on Aging (GERA) sub-cohort (dbGaP phs000788), as well as PCa cases with a biosample in the RPGEH biorepository who were not part of GERA. The ProHealth study focused on ascertaining KP Northern California African-American PCa cases. The California Men's Health Study (CMHS) is a prospective cohort study of KP California men. PCa cases in all three cohorts were identified from the KP Northern California Cancer Registry (KPNCCR), the KP Southern California Cancer Registry (KPSCCR), or review of clinical electronic health records through the end of 2012. The Cancer Registries capture data on all PCa cases newly diagnosed or treated at KP facilities. The Cancer Registries conform to the standards of the North American Association of Central Cancer Registries and the National Cancer Institute's Surveillance, Epidemiology and End Results (SEER) program. Controls were all men in GERA without a PCa diagnosis as of 2012 (although they may have had other cancers). Our GWAS analyses included 7,783 cases and 38,595 controls. Of these, n=34,736 completed a consent that permitted deposition of data to NIH. Our EWAS analyses included 7,489 cases and 7,323 controls. Of these, n=12,985 completed a consent that permitted deposition of data to NIH. Funding. 
The prostate cancer project was supported by National Institutes of Health grants CA127298, CA088164, and CA112355, and by NIH funding of the RPGEH RC2 project (RC2 AG036607). The RPGEH has also been supported by grants from philanthropic foundations, including the Wayne and Gladys Valley Foundation, the Ellison Medical Foundation, and the Robert Wood Johnson Foundation, as well as support from Kaiser Permanente, for work on disease registries, cohort enrollment, survey collection, and collection of biospecimens. The CMHS was originally supported by the California Cancer Research Program.
Created:
2020-02-19



