Data from: Large-scale genotyping of highly polymorphic loci by next generation sequencing: how to overcome the challenges to reliably genotype individuals?
Source: DataONE | Updated: 2015-02-04 | Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
Studying the different roles of adaptive genes is still a challenge in evolutionary ecology and requires reliable genotyping of large numbers of individuals. Next-generation sequencing (NGS) techniques enable such large-scale sequencing, but stringent data processing is required. Here, we develop an easy-to-use methodology to process amplicon-based NGS data and we apply this methodology to reliably genotype four major histocompatibility complex (MHC) loci belonging to MHC class I and II of Alpine marmots (Marmota marmota). Our post-processing methodology allowed us to increase the number of retained reads. The quality of genotype assignment was further assessed using three independent validation procedures. A total of 3069 high-quality MHC genotypes were obtained at four MHC loci for 863 Alpine marmots, with a genotype assignment error rate estimated at 0.21%. The proposed methodology could be applied to any genetic system and any organism, except when extensive copy-number variation occurs (that is, genes with a variable number of copies in the genotype of an individual). Our results highlight the potential of amplicon-based NGS techniques combined with adequate post-processing to obtain the large-scale, highly reliable genotypes needed to understand the evolution of highly polymorphic functional genes.
Created:
2015-02-04



