
Data from: Allele phasing has minimal impact on phylogenetic reconstruction from targeted nuclear gene sequences in a case study of Artocarpus

Source: DataONE | Updated: 2018-05-08 | Indexed: 2024-06-08
Download link:
https://search.dataone.org/view/null
Description:
Premise of the study: Untapped information about allelic diversity within populations and individuals (i.e., heterozygosity) could improve phylogenetic resolution and accuracy. Many phylogenetic reconstructions ignore heterozygosity because it is difficult to assemble allele sequences and combine allelic data across unlinked loci, and it is unclear how reconstruction methods accommodate variable sequences. We review the common methods of including heterozygosity in phylogenetic studies and present a novel method for assembling allele sequences from target-enriched Illumina sequencing libraries.

Methods: We perform supermatrix phylogeny reconstruction and species tree estimation of Artocarpus based on three methods of accounting for heterozygous sequences: a consensus method based on de novo sequence assembly, the use of ambiguity characters, and a novel method for phasing alleles. We characterize the extent to which highly heterozygous sequences impede phylogeny reconstruction and determine whether the use of allele sequences improves resolution or decreases topological uncertainty.

Key Results: We show that it is possible to infer phased alleles from target-enriched Illumina libraries. We find that highly heterozygous sequences do not contribute disproportionately to poor phylogenetic resolution and that the use of allele sequences for phylogeny reconstruction does not have a clear effect on phylogenetic resolution or topological consistency.

Conclusions: We provide a framework for inferring phased alleles from target enrichment data and for assessing the contribution of allelic diversity to phylogenetic reconstruction. In our dataset, the impact of allele phasing on phylogeny is minimal compared to the impact of using phylogenetic reconstruction methods that account for gene tree incongruence.

Created: 2018-05-08