
SARS-CoV-2 sgRNA in silico dataset

Source: DataCite Commons. Updated: 2026-03-10. Indexed: 2026-05-04.
Download link:
https://data.jrc.ec.europa.eu/dataset/b52e86eb-2a75-433a-8112-24f06e1636ba
Description:
We generated 25 highly controlled synthetic Illumina paired-end sequencing datasets to systematically assess the performance of sgRNA detection pipelines (LeTRS, Periscope, sgDI-tector, sgRNAdetect) under realistic but controlled conditions. The collection comprises 5 shotgun metagenomic datasets (125 bp reads, basic error model) and 20 amplicon datasets (ARTIC v.4 and v.5.3.2 schemes × MiSeq/HiSeq error models × 125/300 bp read lengths). Five SARS-CoV-2 reference genomes were used, covering the wild-type sequence, a single TRS-B mutation (A28260G), and 1–3 consecutive mutations in the early N gene (G28280A, A28295G, A28305G). Each dataset contains exactly 500,000 non-sgRNA background reads plus 100 carefully designed sgRNA-supporting paired-end reads (discordant/split reads). The dataset enables direct quantification of tool sensitivity to mutation profile, aligner choice (BWA vs. HISAT2), sequencing strategy, read length, and ARTIC primer scheme version. It is publicly available to support the development, comparison, and validation of current and future sgRNA detection methods.
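Because the number of spiked sgRNA-supporting reads is fixed per dataset, per-tool sensitivity reduces to a simple ratio. The following is a minimal sketch of that arithmetic, not part of the published dataset or of any of the listed pipelines. The function names and the detected counts in the usage lines are hypothetical placeholders; only the two read counts come from the description above.

```python
# Read counts per dataset, taken from the dataset description.
SPIKED_SGRNA_READS = 100
BACKGROUND_READS = 500_000


def sensitivity(detected: int, spiked: int = SPIKED_SGRNA_READS) -> float:
    """Fraction of spiked sgRNA-supporting reads that a tool recovered."""
    if not 0 <= detected <= spiked:
        raise ValueError("detected must be between 0 and the number of spiked reads")
    return detected / spiked


def spiked_fraction(spiked: int = SPIKED_SGRNA_READS,
                    background: int = BACKGROUND_READS) -> float:
    """Share of all reads in one dataset that support an sgRNA."""
    return spiked / (spiked + background)


if __name__ == "__main__":
    # Hypothetical detected counts, for illustration only (not real results).
    example_counts = {"tool_a": 87, "tool_b": 40}
    for tool, detected in example_counts.items():
        print(f"{tool}: sensitivity = {sensitivity(detected):.2f}")
    print(f"spiked fraction = {spiked_fraction():.6f}")
```

In practice the detected counts would come from parsing each pipeline's own output, whose format this sketch does not assume.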
Provider:
European Commission, Joint Research Centre
Created:
2026-03-10