Data_Sheet_1_On Variant Discovery in Genomes of Fungal Plant Pathogens.docx

NIAID Data Ecosystem, indexed 2026-03-11

Download link:

https://figshare.com/articles/dataset/Data_Sheet_1_On_Variant_Discovery_in_Genomes_of_Fungal_Plant_Pathogens_docx/12135462

Resource description:

Comparative genome analyses of eukaryotic pathogens, including fungi and oomycetes, have revealed extensive variability in genome composition and structure. The genomes of individuals from the same population can differ in chromosome number and in the organization of chromosomal segments, defining so-called accessory compartments that have been shown to be crucial to pathogenicity in plant-infecting fungi. This high level of structural variation poses a methodological challenge for population genomic analyses. Variant discovery from population sequencing data is typically achieved using established pipelines based on the mapping of short reads to a reference genome. These pipelines have been developed, and extensively used, for eukaryote genomes of both plants and animals, to retrieve single nucleotide polymorphisms and short insertions and deletions. However, they do not permit the inference of large-scale genomic structural variation, as this task typically requires the alignment of complete genome sequences. Here, we compare traditional variant discovery approaches to a pipeline based on de novo genome assembly of short read data followed by whole genome alignment, using simulated data sets with properties mimicking those of fungal pathogen genomes. We show that the latter approach performs comparably to read-mapping based methodologies when used on sequence data with sufficient coverage. We argue that this approach further allows additional types of genomic diversity to be explored, in particular as long-read third-generation sequencing technologies are becoming increasingly available to generate population genomic data.
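To make the alignment-based idea concrete, the following is a minimal sketch of how single nucleotide polymorphisms and short insertions and deletions could be read off a pairwise alignment between a reference and an assembled genome. It is not the pipeline used in the study: the function `call_variants`, the `Variant` record, and the toy sequences are hypothetical names introduced for illustration, and the sketch assumes the two gapped sequences have already been produced by a whole genome aligner. Indels are reported without the anchor base that formats such as VCF require.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    pos: int   # 1-based reference position (for INS: last reference base before it)
    ref: str   # reference allele ("" for an insertion)
    alt: str   # sample allele ("" for a deletion)
    kind: str  # "SNP", "INS" or "DEL"


def call_variants(ref_aln: str, qry_aln: str) -> List[Variant]:
    """Scan two gapped, equal-length aligned sequences and list their differences."""
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")

    variants: List[Variant] = []
    ref_pos = 0  # number of reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], qry_aln[i]
        if r == "-" and q == "-":
            # Column with gaps in both sequences carries no information.
            i += 1
        elif r == "-":
            # Insertion in the sample: extend over the whole gap run in the reference.
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            alt = qry_aln[i:j].replace("-", "")
            variants.append(Variant(ref_pos, "", alt, "INS"))
            i = j
        elif q == "-":
            # Deletion in the sample: extend over the whole gap run in the sample.
            j = i
            while j < n and qry_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            variants.append(Variant(ref_pos + 1, ref_aln[i:j], "", "DEL"))
            ref_pos += j - i
            i = j
        else:
            # Aligned bases: report a SNP when they differ (case-insensitive).
            ref_pos += 1
            if r.upper() != q.upper():
                variants.append(Variant(ref_pos, r, q, "SNP"))
            i += 1
    return variants


if __name__ == "__main__":
    # Toy alignment: one SNP, one 1-bp insertion, one 2-bp deletion.
    for v in call_variants("ACGT-ACGTAC", "ACCTGAC--AC"):
        print(v)
```

Because the scan works on aligned sequences rather than on mapped reads, the same traversal could in principle be extended to longer gap runs, which is where an assembly-based approach is argued to add value over read mapping.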

Created:

2020-04-16
