Simulated reads for benchmarking SARS-CoV-2 lineage abundance estimation
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8298886
Description:
To evaluate the accuracy of lineage abundance estimates from amplicon-based and whole genome-based sequencing, we simulated paired-end reads from amplicons determined by AmpliDiff, and reads spanning full genomes. Abundances of lineages are based on the relative abundance of a lineage within the dataset. The data consists of the following 8 independent datasets:
200 bp reads from the Netherlands based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
400 bp reads from the Netherlands based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
200 bp reads from the Netherlands based on whole genome sequencing at 100x coverage,
400 bp reads from the Netherlands based on whole genome sequencing at 100x coverage,
200 bp reads from Texas based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
400 bp reads from Texas based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
200 bp reads from Texas based on whole genome sequencing at 100x coverage,
400 bp reads from Texas based on whole genome sequencing at 100x coverage.
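The description says lineage abundances are based on the relative abundance of a lineage within the dataset. A minimal sketch of that calculation follows, assuming it means the fraction of genomes in a dataset assigned to each lineage. The function name `relative_abundances` and the example lineage labels are hypothetical and are not taken from the actual data.

```python
from collections import Counter


def relative_abundances(lineage_labels):
    """Return each lineage's share of the genomes in a dataset.

    Assumption: one lineage label per genome, and abundance is the
    fraction of genomes carrying that label.
    """
    counts = Counter(lineage_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lineage: n / total for lineage, n in counts.items()}


# Hypothetical labels for illustration only, one per genome.
example_labels = ["BA.1", "BA.1", "BA.2", "XBB.1.5"]
abundances = relative_abundances(example_labels)
```

The resulting fractions sum to 1 and could serve as ground truth when comparing against estimates from the simulated reads.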
Every independent dataset contains 20 sets of reads (generated with different random seeds). The genomes used for the Netherlands-based simulations can be obtained via GISAID through accession id EPI_SET_230825fe, and the genomes used for the Texas-based simulations can be obtained via GISAID through accession id EPI_SET_230825pe.
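The eight datasets listed above are every combination of two regions, two sequencing strategies and two read lengths. The sketch below enumerates that design as a way to index the benchmark. The dictionary keys and variable names are hypothetical and do not reflect the file layout of the Zenodo record.

```python
from itertools import product

REGIONS = ["Netherlands", "Texas"]
READ_LENGTHS_BP = [200, 400]
# Coverage and amplicon counts as stated in the description.
STRATEGIES = {
    "amplicon": {"coverage": 1000, "amplicon_counts": [1, 2, 5, 10]},
    "wgs": {"coverage": 100, "amplicon_counts": None},
}
READ_SETS_PER_DATASET = 20  # one per random seed

# Iteration order matches the listing: region, then strategy, then read length.
datasets = [
    {
        "region": region,
        "strategy": strategy,
        "read_length_bp": read_length,
        "coverage": STRATEGIES[strategy]["coverage"],
        "amplicon_counts": STRATEGIES[strategy]["amplicon_counts"],
    }
    for region, strategy, read_length in product(REGIONS, STRATEGIES, READ_LENGTHS_BP)
]

total_read_sets = len(datasets) * READ_SETS_PER_DATASET

for d in datasets:
    print(d["region"], d["strategy"], d["read_length_bp"], f'{d["coverage"]}x')
```

With 20 sets of reads per dataset, the eight datasets give 160 sets of reads in total.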
Created:
2023-08-30



