SARS-CoV-2 sgRNA in silico dataset
Source: DataCite Commons. Updated: 2026-03-10. Indexed: 2026-05-04.
Download link:
https://data.jrc.ec.europa.eu/dataset/b52e86eb-2a75-433a-8112-24f06e1636ba
Description:
We generated 25 highly controlled synthetic Illumina paired-end sequencing datasets to systematically assess the performance of sgRNA detection pipelines (LeTRS, Periscope, sgDI-tector, sgRNAdetect) under realistic but controlled conditions. The collection comprises 5 shotgun metagenomic datasets (125 bp reads, basic error model) and 20 amplicon datasets (ARTIC v.4 and v.5.3.2 schemes × MiSeq/HiSeq error models × 125/300 bp read lengths).
Five SARS-CoV-2 reference genomes were used, covering the wild-type sequence, a single TRS-B mutation (A28260G), and 1–3 consecutive mutations in the early N gene (G28280A, A28295G, A28305G).
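The mutation labels above follow the common `<reference base><position><alternate base>` notation. As a minimal sketch of how such labels could be parsed and applied to a sequence, the Python below assumes 1-based positions relative to whichever reference genome the dataset authors used (the description does not name it). The helper names `parse_mutation` and `apply_mutations` are hypothetical and are not part of the dataset or of any of the listed tools.

```python
import re

# Matches labels such as "A28260G": reference base, 1-based position, alternate base.
MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(label):
    """Split a substitution label into (ref_base, position, alt_base)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a single-base substitution label: {label!r}")
    ref_base, position, alt_base = match.groups()
    return ref_base, int(position), alt_base


def apply_mutations(sequence, labels):
    """Return a copy of `sequence` with each substitution applied.

    Assumption: positions are 1-based and refer to `sequence` itself.
    The reference base is checked so that a coordinate mismatch fails loudly.
    """
    bases = list(sequence)
    for label in labels:
        ref_base, position, alt_base = parse_mutation(label)
        index = position - 1  # convert 1-based position to 0-based index
        if bases[index] != ref_base:
            raise ValueError(
                f"{label}: expected {ref_base} at position {position}, "
                f"found {bases[index]}"
            )
        bases[index] = alt_base
    return "".join(bases)


if __name__ == "__main__":
    # Toy sequence for illustration only; not real SARS-CoV-2 sequence.
    print(parse_mutation("A28260G"))
    print(apply_mutations("ACGTACGT", ["A5G"]))
```

The reference-base check is a deliberate design choice: if the coordinates in a label do not match the sequence being edited, the function raises an error instead of silently producing a wrong genome.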
Each dataset contains exactly 500,000 non-sgRNA background reads plus 100 carefully designed sgRNA-supporting paired-end reads (discordant/split reads).
The dataset enables direct quantification of tool sensitivity to mutation profiles, aligner choice (BWA vs HISAT2), sequencing strategy, read length, and ARTIC primer scheme version. It is publicly available to support the development, comparison, and validation of current and future sgRNA detection methods.
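Because each dataset has a known composition (500,000 background reads and 100 sgRNA-supporting reads, as stated above), the sgRNA fraction and a tool's sensitivity can be computed directly. The Python below is a rough sketch under that assumption. The function names and the example count of 87 detected reads are hypothetical illustrations, not results from the dataset, and the counting unit (reads versus read pairs) is taken as given in the description.

```python
# Known composition of each synthetic dataset, taken from the description.
BACKGROUND_READS = 500_000
SGRNA_READS = 100


def sgrna_fraction(background=BACKGROUND_READS, sgrna=SGRNA_READS):
    """Fraction of all reads in a dataset that support an sgRNA."""
    return sgrna / (background + sgrna)


def sensitivity(true_positives, total_sgrna=SGRNA_READS):
    """Share of the planted sgRNA-supporting reads that a tool recovered."""
    if not 0 <= true_positives <= total_sgrna:
        raise ValueError("true_positives must lie between 0 and total_sgrna")
    return true_positives / total_sgrna


if __name__ == "__main__":
    print(f"sgRNA fraction: {sgrna_fraction():.6%}")
    # 87 is a made-up count used only to show the call.
    print(f"sensitivity: {sensitivity(87):.2f}")
```

Comparing `sensitivity` across datasets that differ in a single factor (mutation profile, aligner, read length, or primer scheme) is one simple way to use the controlled design.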
Provider:
European Commission, Joint Research Centre
Created:
2026-03-10



