
Simulated reads for benchmarking SARS-CoV-2 lineage abundance estimation

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/8298886
Description:
To evaluate the accuracy of lineage abundance estimates from amplicon-based and whole-genome sequencing, we simulated paired-end reads from amplicons determined by AmpliDiff, and reads spanning full genomes. Lineage abundances are based on each lineage's relative abundance within the dataset. The data consist of the following 8 independent datasets:

1. 200 bp reads from the Netherlands based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage
2. 400 bp reads from the Netherlands based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage
3. 200 bp reads from the Netherlands based on whole genome sequencing at 100x coverage
4. 400 bp reads from the Netherlands based on whole genome sequencing at 100x coverage
5. 200 bp reads from Texas based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage
6. 400 bp reads from Texas based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage
7. 200 bp reads from Texas based on whole genome sequencing at 100x coverage
8. 400 bp reads from Texas based on whole genome sequencing at 100x coverage

Every independent dataset contains 20 sets of reads, each generated with a different random seed. The genomes used for the Netherlands-based simulations can be obtained via GISAID through accession ID EPI_SET_230825fe, and the genomes used for the Texas-based simulations can be obtained via GISAID through accession ID EPI_SET_230825pe.
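
As a minimal sketch of what "relative abundance of a lineage within the dataset" could mean in practice, the Python snippet below computes each lineage's share of a list of genomes. It assumes every genome counts equally and carries one lineage label; the function name `relative_abundances` and the example labels are hypothetical and do not come from the dataset or from the simulation code used by the authors.

```python
from collections import Counter


def relative_abundances(lineages):
    """Map each lineage label to its fraction of the genomes in the list.

    Assumption: each genome contributes equally, so the abundance of a
    lineage is (number of genomes with that label) / (total genomes).
    """
    if not lineages:
        raise ValueError("need at least one genome")
    counts = Counter(lineages)
    total = len(lineages)
    return {lineage: n / total for lineage, n in counts.items()}


# Hypothetical lineage labels, one per genome (not taken from the dataset).
example = ["BA.5", "BA.5", "BA.2", "XBB.1.5"]
abund = relative_abundances(example)
```

Fractions computed this way sum to 1 across lineages, which makes them a natural ground truth to compare against abundance estimates produced from the simulated reads.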
Created:
2023-08-30