Data from: Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans

DataONE | Updated 2018-03-29 | Indexed 2024-06-25
Download link:
https://search.dataone.org/view/null
Description:
It has long been suspected, based on the divergence between humans and other species, that the rate of mutation varies across the human genome at a large scale. However, it is now possible to investigate this question directly using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differ between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scales that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated with the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence were due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.
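
To make the windowed quantities in the description concrete, here is a minimal Python sketch of counting DNMs in fixed-size windows and measuring what fraction of windows fall within 50% of the mean count. It is an illustration only, not the authors' method: the function names, the input format (a list of coordinates on one chromosome) and the toy positions are all assumptions. The ~90% figure quoted above is inferred after accounting for sampling noise, so a raw fraction computed this way would not be expected to reproduce it.

```python
from collections import Counter


def window_counts(positions, chrom_length, window=1_000_000):
    """Count mutations per fixed-size window along one chromosome.

    positions: iterable of 0-based mutation coordinates (hypothetical input format).
    Returns one count per window, including windows with no mutations.
    """
    n_windows = (chrom_length + window - 1) // window  # ceiling division
    counts = Counter(pos // window for pos in positions)
    return [counts.get(i, 0) for i in range(n_windows)]


def fraction_within(counts, tolerance=0.5):
    """Fraction of windows whose count lies within +/- tolerance of the mean count."""
    mean = sum(counts) / len(counts)
    lo, hi = mean * (1 - tolerance), mean * (1 + tolerance)
    return sum(lo <= c <= hi for c in counts) / len(counts)


# Toy coordinates for illustration; these are not taken from the dataset.
toy_positions = [100, 250_000, 1_200_000, 1_900_000,
                 2_500_000, 3_100_000, 3_200_000, 3_999_999]
toy_counts = window_counts(toy_positions, chrom_length=4_000_000)
print(toy_counts)                   # [2, 2, 1, 3]
print(fraction_within(toy_counts))  # 1.0
```

Changing `window` to 100_000 or 10_000_000 gives the other two scales mentioned in the description. With real data, the raw per-window counts mix true rate variation with Poisson sampling noise, which is why a statistical model is needed to separate the two.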

Created:
2018-03-29