Most damaging CADD scores for hg19 human genome build (CADD scores generated with bStatistic removed)
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.n8pk0p2x0
Description:
Analyses of genetic variation in many taxa have established that neutral genetic diversity is shaped by natural selection at linked sites. Whether the mode of selection is primarily the fixation of strongly beneficial alleles (selective sweeps) or purifying selection on deleterious mutations (background selection) remains unknown, however. We address this question in humans by fitting a model of the joint effects of selective sweeps and background selection to autosomal polymorphism data from the 1000 Genomes Project. After controlling for variation in mutation rates along the genome, a model of background selection alone explains ~60% of the variance in diversity levels at the megabase scale. Adding the effects of selective sweeps driven by adaptive substitutions to the model does not improve the fit, and when both modes of selection are considered jointly, selective sweeps are estimated to have had little or no effect on linked neutral diversity. The regions under purifying selection are best predicted by phylogenetic conservation, with ~80% of the deleterious mutations affecting neutral diversity occurring in non-exonic regions. Thus, background selection is the dominant mode of linked selection in humans, with marked effects on diversity levels throughout autosomes.
Methods
This dataset contains CADD scores for the hg19 human genome build, split by autosome and stored as NumPy zipped archives (.npz format). Each site in the genome is assigned the most damaging CADD score at that site. These per-site scores were derived from a custom set of CADD scores generated by the Kircher Lab, which maintains and updates the CADD project. The Kircher Lab removed the bStatistic input (based on McVicker's B) from the set of annotations used to generate CADD scores: some of our work infers new B scores using CADD as an input, and we wanted to avoid the circularity of building new B scores from an annotation that includes the old B scores.
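Since the description does not say how the arrays inside each .npz archive are named or shaped, a reasonable first step is to inspect an archive before using it. The following is a minimal sketch, not the authors' code: the function names `describe_npz` and `most_damaging_per_site` are hypothetical, and the second function only illustrates one plausible reading of "most damaging score per site" (the maximum over the scores of the possible alternate alleles, since higher CADD scores indicate more deleterious variants). The layout assumed there, one row per site and one column per alternate allele, is an assumption and not something the dataset documents.

```python
import sys

import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz archive."""
    with np.load(path) as archive:
        # archive.files lists the names of the arrays saved in the archive.
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


def most_damaging_per_site(alt_scores):
    """Collapse per-allele scores of shape (n_sites, n_alt) to one score per site.

    ASSUMPTION: "most damaging" is taken to mean the maximum score across the
    alternate alleles at a site. This layout is hypothetical.
    """
    return np.max(np.asarray(alt_scores, dtype=float), axis=1)


if __name__ == "__main__":
    # Usage: python inspect_cadd.py <file1.npz> [<file2.npz> ...]
    for npz_path in sys.argv[1:]:
        for name, (shape, dtype) in describe_npz(npz_path).items():
            print(f"{npz_path}\t{name}\tshape={shape}\tdtype={dtype}")
```

Running the script on one of the downloaded archives prints the name, shape, and dtype of each stored array, which shows whether the file holds a single per-site score vector or additional arrays such as positions.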
Created:
2023-08-31



