A Mesolithic Aurochs Genome and Early Medieval Cow Genome. A high coverage Mesolithic aurochs genome and effective leveraging of ancient cattle genomes using whole genome imputation.
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJEB74338
Description:
Ancient genomic analyses are often restricted to pseudo-haploid data because of low genome coverage. Leveraging low-coverage data by imputation to calculate phased diploid genotypes, which enable haplotype-based interrogation and SNP calling at unsequenced positions, is highly desirable. This has not been investigated for ancient cattle genomes, despite these being compelling subjects for archaeological, evolutionary and economic reasons. Here we test this approach by sequencing a Mesolithic European aurochs (18.49x; 9852-9376 calBC) and an Early Medieval European cow (18.69x; 427-580 calAD), and combining these with five published individuals (two ancient and three modern). We downsample these genomes (0.25x, 0.5x, 1.0x, 2.0x) and impute diploid genotypes using a reference panel of 171 published modern cattle genomes that we curated for 21.7 million (Mn) phased single-nucleotide polymorphisms (SNPs). We recover high densities of correct calls, with an accuracy of >99.1% at variant sites for the lowest downsample depth of 0.25x, increasing to >99.5% for 2.0x (transversions only, minor allele frequency (MAF) ≥2.5%). The recovery of SNPs correlates with coverage: on average 58% of sites are recovered for 0.25x, increasing to 87% for 2.0x, utilising an average of 3.5 Mn transversions (MAF ≥2.5%), even in the aurochs, which is temporally and morphologically distinct from the reference panel. Our imputed genomes behave similarly to directly called data in allele-frequency-based analyses, for example consistently identifying runs of homozygosity >2 Mb, including a long homozygous region in the Mesolithic European aurochs.
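To make the two reported metrics concrete, the following is a minimal Python sketch of how recovery and accuracy could be computed when imputed genotypes from a downsampled genome are compared against calls from the same genome at full coverage. It is an illustration under stated assumptions, not the authors' pipeline: the names `Site`, `is_transversion` and `evaluate` are hypothetical, "recovered" is assumed to mean that an imputed genotype was retained at a site, and "accuracy" is assumed to mean unphased genotype concordance among recovered sites. The description does not say how either quantity was defined in the study.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

SiteKey = Tuple[str, int]   # (chromosome, position); hypothetical key layout
Genotype = Tuple[int, int]  # diploid genotype as allele indices, e.g. (0, 1)


@dataclass(frozen=True)
class Site:
    ref: str
    alt: str
    maf: float  # minor allele frequency in the reference panel


def is_transversion(ref: str, alt: str) -> bool:
    """Return True for a biallelic SNP that is not a transition."""
    return ref != alt and (ref, alt) not in TRANSITIONS


def evaluate(
    sites: Dict[SiteKey, Site],
    truth: Dict[SiteKey, Genotype],
    imputed: Dict[SiteKey, Genotype],
    min_maf: float = 0.025,
) -> Tuple[float, float]:
    """Return (recovery, accuracy) over transversions with MAF >= min_maf.

    Assumed definitions (not taken from the study):
      recovery = sites with a retained imputed genotype / eligible sites
      accuracy = imputed genotypes matching the truth / recovered sites
    Phase is ignored, so (0, 1) and (1, 0) count as the same genotype.
    """
    eligible = recovered = correct = 0
    for key, site in sites.items():
        if not is_transversion(site.ref, site.alt) or site.maf < min_maf:
            continue
        true_gt = truth.get(key)
        if true_gt is None:
            continue  # no high-coverage call to compare against
        eligible += 1
        imputed_gt = imputed.get(key)
        if imputed_gt is None:
            continue  # imputed genotype missing or filtered out
        recovered += 1
        if sorted(imputed_gt) == sorted(true_gt):
            correct += 1
    recovery = recovered / eligible if eligible else 0.0
    accuracy = correct / recovered if recovered else 0.0
    return recovery, accuracy
```

Restricting the comparison to transversions is a common precaution with ancient DNA, because post-mortem deamination mainly produces spurious transition calls. The default threshold of 0.025 mirrors the MAF ≥2.5% filter quoted in the description.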
Created:
2024-05-08



