
Top twenty 10 kb regions in the human genome by estimated TMRCA.

Source: NIAID Data Ecosystem (indexed 2026-03-08)
Download link:
https://figshare.com/articles/dataset/_Top_twenty_10_kb_regions_in_the_human_genome_by_estimated_TMRCA_/1028596
Description:
a. Genomic coordinates in hg19 assembly. The genome was simply partitioned into nonoverlapping 10 kb intervals in hg19 coordinates.
b. Posterior expected TMRCA in generations, averaged across unfiltered genomic positions in region.
c. Number of polymorphisms in Complete Genomics dataset in region per kilobase of unfiltered sequence.
d. Normalized polymorphism rate: number of polymorphisms per unfiltered kilobase divided first by the local mutation rate (as estimated from divergence to nonhuman primate outgroup genomes) then by the average of the same polymorphism/divergence ratio in designated neutral regions. The resulting value can be interpreted as a fold increase in the mutation-normalized polymorphism rate compared with the expectation under neutrality. The same measure was computed from the much larger 1000 Genomes Project Phase 1 data set, and was significantly elevated in these 20 high-TMRCA regions (Supplementary Figure S13).
e. Possible copy number variant (CNV), based on Complete Genomics "hypervariable" or "invariant" labels (see Methods). Polymorphism rates in these regions may be over-estimated.
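
The binning in note a and the two-step normalization in note d can be sketched in Python as follows. This is only an illustration of the arithmetic described above, not the authors' pipeline. The function names are hypothetical, the 0-based half-open coordinate convention is an assumption, and the numbers in the usage section are made up. How the local mutation rate is derived from outgroup divergence is not specified here, so it is taken as a given input.

```python
from statistics import mean

BIN_SIZE = 10_000  # 10 kb windows, as stated in note a


def bin_index(pos, bin_size=BIN_SIZE):
    """Index of the nonoverlapping window containing a position.

    Assumes 0-based positions (an assumption; the source only says hg19).
    """
    return pos // bin_size


def bin_interval(index, bin_size=BIN_SIZE):
    """Half-open [start, end) interval of a window, 0-based (assumed)."""
    start = index * bin_size
    return start, start + bin_size


def polymorphisms_per_kb(n_polymorphisms, unfiltered_bp):
    """Note c: polymorphisms per kilobase of unfiltered sequence."""
    return n_polymorphisms / (unfiltered_bp / 1000)


def normalized_rate(poly_per_kb, local_mutation_rate, neutral_ratios):
    """Note d: divide by the local mutation rate, then by the mean of the
    same polymorphism/divergence ratio over designated neutral regions."""
    ratio = poly_per_kb / local_mutation_rate
    return ratio / mean(neutral_ratios)


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    idx = bin_index(123_456)
    print(idx, bin_interval(idx))
    per_kb = polymorphisms_per_kb(n_polymorphisms=30, unfiltered_bp=7_500)
    print(normalized_rate(per_kb, local_mutation_rate=0.5,
                          neutral_ratios=[2.0, 2.0]))
```

A result above 1 would correspond to the "fold increase" reading in note d: the window carries more polymorphism than expected under neutrality after accounting for its local mutation rate.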

Created:
2014-05-15