Comparing methods for SNP calling from Genotyping-By-Sequencing (GBS) data for a large-genome conifer without a published genome sequence

Source: DataONE. Updated 2021-03-08; indexed 2025-06-21.

Download link:

https://search.dataone.org/view/sha256:4719f640571ff0974177bdc6406f972a71a3033f7ad05537ecd0f866da8ca743

Description:

Reduced-representation, restriction-enzyme-based sequencing methods have been shown to be robust and cost-effective for identifying Single Nucleotide Polymorphisms (SNPs). Aligning short-read fragments to a genome sequence of the same species yields better SNP calling than de novo approaches, but only a few tree species, and few conifers in particular, have an annotated sequence. Many conifer genomes are huge (>19 Gbp) and include a large proportion of repeat sequences, which makes assembly difficult. The sequence of a related species could be used instead, but choosing the proper pipeline for SNP calling remains challenging. Here we compare the performance of four bioinformatics pipelines: two that require a reference genome (TASSEL-GBS V2 and Stacks) and two that are de novo pipelines (UNEAK and Stacks). We used Illumina GBS data from 94 ponderosa pines. Using the loblolly pine genome as the reference greatly increased the number of SNPs called (62 -196 t...

Created:

2025-06-04
