
Linked-read sequencing enables haplotype-resolved resequencing at population scale

Indexed from NIAID Data Ecosystem on 2026-03-12
Download link:
https://www.ncbi.nlm.nih.gov/sra/ERP121631
Description:
The ability to sequence entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences – including the quantification and dating of admixture, introgression and demographic events, and inference of selective sweeps – are still limited by the lack of high-quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype-resolved genome resequencing at population scale, we investigated properties of linked-read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results, based on the comparison of downsampled (25x, 20x, 15x, 10x, 7x, and 5x) with high-coverage data (46-68x) of seven bird genomes mapped to a reference, suggest that phasing contiguities and accuracies adequate for most population genomic analyses can already be reached with moderate sequencing effort. At 15x coverage, phased haplotypes span about 90% of the genome assembly, with 50 and 90 percent of phased sequences located in phase blocks longer than 1.25-4.6 Mb (N50) and 0.27-0.72 Mb (N90), respectively. Phasing accuracy exceeds 99% from 15x coverage upward. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb (N50/N90) at 25x coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher-quality DNA may help keep sequencing costs at bay. In conclusion, even for organisms with gigabase-sized genomes like birds, linked-read sequencing at moderate depth opens an affordable avenue towards haplotype-resolved genome resequencing at population scale.
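The N50 and N90 figures quoted in the description follow the usual contiguity definition: the block length at which blocks of that length or longer together hold at least 50% (or 90%) of all phased sequence. A minimal sketch of that calculation is given below. The function name `nx` and the block lengths are hypothetical illustrations, not values or code from the dataset.

```python
def nx(block_lengths, fraction):
    """Return the Nx statistic for a list of phase block lengths.

    Blocks are visited from longest to shortest; the result is the length
    of the block at which the running total first reaches `fraction` of
    the summed block length (fraction=0.5 gives N50, 0.9 gives N90).
    """
    if not block_lengths:
        raise ValueError("block_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Hypothetical phase block lengths in Mb (illustration only).
example_blocks = [5, 4, 3, 2, 1]
n50 = nx(example_blocks, 0.5)  # 5 + 4 = 9 >= 7.5, so N50 is 4
n90 = nx(example_blocks, 0.9)  # 5 + 4 + 3 + 2 = 14 >= 13.5, so N90 is 2
```

N90 is never larger than N50, because more of the shorter blocks have to be included to reach 90% of the phased sequence.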

Created:
2021-02-04