Most damaging CADD scores for hg19 human genome build (CADD scores generated with bStatistic removed)

Source: DataONE. Updated 2023-09-01. Indexed 2024-06-08.

Download link:

https://search.dataone.org/view/sha256:b84820cac0a9ab91ae4a8947b7f2d050d8353f795cb2ca6a6d29fa391ba980fc

Resource description:

Analyses of genetic variation in many taxa have established that neutral genetic diversity is shaped by natural selection at linked sites. Whether the mode of selection is primarily the fixation of strongly beneficial alleles (selective sweeps) or purifying selection on deleterious mutations (background selection) remains unknown, however. We address this question in humans by fitting a model of the joint effects of selective sweeps and background selection to autosomal polymorphism data from the 1000 Genomes Project. After controlling for variation in mutation rates along the genome, a model of background selection alone explains ~60% of the variance in diversity levels at the megabase scale. Adding the effects of selective sweeps driven by adaptive substitutions to the model does not improve the fit, and when both modes of selection are considered jointly, selective sweeps are estimated to have had little or no effect on linked neutral diversity. The regions under purifying selection ...

This dataset contains CADD scores separated by human autosome, stored as NumPy zipped files (.npz format). Each score is the most damaging CADD score for that site in the genome. These most damaging scores per site were based on a custom set of CADD scores generated by the Kircher Lab, which maintains and updates the CADD project. The lab removed the bStatistic input (based on McVicker's B) from the set of annotations used to generate CADD scores, since some of our work infers new B scores using CADD as an input, and we wanted to avoid the circularity of building new B scores using an annotation that includes the old B scores.

The original CADD scores from which these were derived can be found at: https://cadd.gs.washington.edu/

These files must be opened using NumPy. They are formatted such that for each chromosome the most damaging CADD score is given for each chrom...
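Because the files must be opened with NumPy, a minimal sketch of inspecting one archive may be useful. The array names stored inside the .npz files are not stated in this description, so the sketch simply lists whatever keys are present rather than assuming one. The file name `chr22_demo.npz`, the key `scores`, and the helper `summarize_npz` are hypothetical stand-ins created only for this illustration; they are not part of the dataset.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype_string)} for every array in an .npz archive."""
    summary = {}
    # np.load on an .npz file returns a lazy NpzFile; each array is read on key access.
    with np.load(path) as archive:
        for name in archive.files:
            arr = archive[name]
            summary[name] = (arr.shape, str(arr.dtype))
    return summary


# Synthetic stand-in for a per-chromosome file (hypothetical name and key).
demo_path = os.path.join(tempfile.mkdtemp(), "chr22_demo.npz")
np.savez_compressed(demo_path, scores=np.array([0.5, 12.3, 35.0], dtype=np.float32))

# Inspect the archive before assuming anything about its layout.
for name, (shape, dtype) in summarize_npz(demo_path).items():
    print(name, shape, dtype)
```

To apply this to a downloaded file, pass its path to `summarize_npz` and then read the array of interest with `np.load(path)[name]`, using whichever key the listing reports.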

Created:

2023-11-29