Comparing methods for SNP calling from Genotyping-By-Sequencing (GBS) data for a large-genome conifer without a published genome sequence
Source: DataONE. Updated: 2021-03-08. Indexed: 2025-06-21.
Download link:
https://search.dataone.org/view/sha256:4719f640571ff0974177bdc6406f972a71a3033f7ad05537ecd0f866da8ca743
Description:
Reduced-representation, restriction-enzyme-based sequencing methods have been demonstrated to be robust and cost-effective for identifying Single Nucleotide Polymorphisms (SNPs). Although aligning short-read fragments to a genome sequence of the same species yields better SNP calling than de novo approaches, only a few tree species, and few conifers in particular, have an annotated genome sequence. Many conifer genomes are huge (>19 Gb) and include a large proportion of repeat sequences, making assembly difficult. While the sequence of a related species could be used instead, choosing the proper pipeline for SNP calling is still challenging. Here we compare the performance of four bioinformatics pipelines: two that require a reference genome (TASSEL-GBS V2 and Stacks) and two de novo pipelines (UNEAK and Stacks). We used Illumina GBS data from 94 ponderosa pines. Using the loblolly pine genome as the reference greatly increased the number of SNPs called (62 -196 t...
Created:
2025-06-04



