
Comparing methods for SNP calling from Genotyping-By-Sequencing (GBS) data for a large-genome conifer without a published genome sequence

Source: DataONE. Updated 2021-03-08; indexed 2025-06-21.
Download link:
https://search.dataone.org/view/sha256:4719f640571ff0974177bdc6406f972a71a3033f7ad05537ecd0f866da8ca743
Description:
Reduced-representation, restriction-enzyme-based sequencing methods have been demonstrated to be robust and cost-effective for identifying Single Nucleotide Polymorphisms (SNPs). While aligning the short-read fragments to a genome sequence of the same species yields better SNP calling than de novo approaches, only a few tree species, and few conifers in particular, have an annotated sequence. Many conifer genomes are huge (>19 Gb) and include a large proportion of repeat sequences, making assembly difficult. While the sequence of a related species could be used, choosing the proper pipeline for SNP calling is still challenging. Here we compare the performance of four bioinformatics pipelines: two that require a reference genome (TASSEL-GBS V2 and Stacks) and two de novo pipelines (UNEAK and Stacks). We used Illumina GBS data from 94 ponderosa pines. Using the loblolly pine genome as the reference greatly increased the number of SNPs called (62 -196 t...
Created:
2025-06-04