Data from: An integrative approach to prioritize candidate causal genes for complex traits in cattle
Source: DataCite Commons | Updated: 2025-06-04 | Indexed: 2025-06-15
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.bcc2fqzph
Resource description:
Genome-wide association studies (GWAS) have identified many quantitative
trait loci (QTL) associated with complex traits, predominantly in
non-coding regions, posing challenges in pinpointing the causal variants
and their target genes. Three types of evidence can help identify the gene
through which QTL act: (1) proximity to the most significant GWAS variant,
(2) correlation of gene expression with the trait, and (3) the gene’s
physiological role in the trait. However, it remains uncertain how
reliably these methods identify the correct genes. Here we
test these methods on a comparatively simple series of
traits associated with the concentration of polar lipids in milk. We
conducted single-trait GWAS for ~14 million imputed variants and 56
individual milk polar lipid (PL) phenotypes in 336 cows. A meta-analysis
of multi-trait GWAS identified 10,063 significant SNPs at FDR ≤ 10% (P ≤
7.15E-5). Transcriptome data from blood (~12.5K genes, 143 cows) and
mammary tissue (~12.2K genes, 169 cows) were analysed using the genetic
score omics regression (GSOR) method. This method links observed gene
expression to genetically predicted phenotypes and was used to find
associations between gene expression and 56 PL phenotypes. GSOR identified
2,186 genes in blood and 1,404 in mammary tissue associated with at least
one PL phenotype (FDR ≤ 1%). We partitioned the genome into
non-overlapping windows of 100 Kb to test for overlap between
GSOR-identified genes and GWAS signals. We found a significant overlap
between these two datasets, indicating that GSOR significant genes were
more likely to be located within 100 Kb windows that have GWAS signals
compared to those without (P = 0.01; odds ratio = 1.47). These windows
included 70 significant genes expressed in mammary tissue and 95 in blood.
Compared to all expressed genes in each tissue, these genes were enriched
for lipid metabolism gene ontology (GO) terms. That is, 7 of the 70 significant
mammary transcriptome genes (P < 0.01; odds ratio = 3.98) and 5 of
the 95 significant blood genes (P < 0.10; odds ratio = 2.24) were
annotated with lipid metabolism GO terms. The candidate causal genes include DGAT1,
ACSM5, SERINC5, ABHD3, CYP2U1, PIGL, ARV1, SMPD5, and NPC2, with some
overlap between the two tissues. The overlap between GWAS, GSOR, and GO
analyses suggests that together these methods can identify genes mediating
QTL, though their power remains limited, as reflected by modest odds
ratios. Larger sample sizes would enhance the power of these analyses, but
issues like linkage disequilibrium would remain.
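
The window-overlap test described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the unit of analysis is the gene, that each gene is placed by a single representative coordinate, and that a one-sided Fisher's exact test is an acceptable stand-in for whatever test the study used. All gene names, positions, and SNPs below are made-up toy values, and the function names are hypothetical.

```python
from math import comb

WINDOW = 100_000  # 100 Kb window size, as stated in the description


def window_of(chrom, pos, size=WINDOW):
    """Return the non-overlapping window that contains a position."""
    return (chrom, pos // size)


def overlap_table(genes, significant, gwas_snps, size=WINDOW):
    """Count genes by (significant or not) x (in a GWAS-signal window or not).

    genes: dict of gene name -> (chrom, pos), one representative coordinate
    per gene (an assumption of this sketch).
    Returns (a, b, c, d) = (sig in, sig out, non-sig in, non-sig out).
    """
    signal_windows = {window_of(ch, p, size) for ch, p in gwas_snps}
    a = b = c = d = 0
    for name, (chrom, pos) in genes.items():
        inside = window_of(chrom, pos, size) in signal_windows
        if name in significant:
            a, b = (a + 1, b) if inside else (a, b + 1)
        else:
            c, d = (c + 1, d) if inside else (c, d + 1)
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table; infinite if b*c is zero."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null (enrichment direction)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Toy data only: hypothetical genes, significance calls, and GWAS hits.
genes = {
    "g1": ("1", 50_000), "g2": ("1", 150_000), "g3": ("1", 250_000),
    "g4": ("2", 20_000), "g5": ("2", 320_000), "g6": ("2", 410_000),
}
significant = {"g1", "g2", "g5"}
gwas_snps = [("1", 60_000), ("1", 199_999), ("2", 405_000)]

table = overlap_table(genes, significant, gwas_snps)
odds = odds_ratio(*table)
p_value = fisher_one_sided(*table)
print(table, odds, p_value)  # (2, 1, 1, 2) 4.0 0.5
```

The same 2x2 logic could in principle be applied to the GO enrichment comparison, but that would need the background gene counts per tissue, which this description does not give.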
Provider:
Dryad
Created:
2025-06-04



