Artificial real metagenomic reads
Source: NIAID Data Ecosystem. Indexed: 2026-05-01
Download link:
https://zenodo.org/record/10472795
Description:
We created Nanopore and Illumina metagenomic datasets by combining real (non-simulated) sequencing reads into an artificial metagenomic dataset. By doing this, we can be highly confident of the true taxon of each read in the dataset.
We used samples that have matched Illumina and Nanopore sequencing to ensure that differences between the two datasets are driven by sequencing technology and not by sample composition.

For the human component, we combined reads from three individuals in the 1000 Genomes Project. The Nanopore data for the human component were downloaded from the 1000G ONT Sequencing Consortium (https://millerlaboratory.com/1000G-ONT.html), and we provide URLs for each sample: HG00277 (Finnish, male) with Illumina NovaSeq 6000 (accession: ERR3241786) and Nanopore R10.4; NA19318 (Luhya, Kenya, male) with Illumina NovaSeq 6000 (accession: ERR3239713) and Nanopore R10.4 (basecalled with Dorado v0.3.4); HG03611 (Bengali, Bangladesh, female) with Illumina NovaSeq 6000 (accession: ERR3243073) and Nanopore R10.4 (basecalled with Dorado v0.3.4). Each human readset was randomly downsampled to 1 Gbp using rasusa (v0.7.1).

For the M. tuberculosis component, we used Illumina HiSeq 4000 (accession: ERR245682) and Nanopore R10.3 (accession: ERR8170871) reads. We used R10.3 because no R10.4 M. tuberculosis WGS datasets are publicly available.

For the bacterial component, we used Illumina MiSeq (accession: ERR7255689) and Nanopore R10.4 (accession: ERR7287988) reads from the ZymoBIOMICS HMW DNA Standard D6322 (Zymo Research), which contains seven bacterial strains and one fungal strain, none of which are Mycobacterium.

We removed Nanopore reads shorter than 500 bp from all datasets, and the M. tuberculosis and Zymo datasets were downsampled to 3 Gbp with rasusa. All human, M. tuberculosis, and Zymo reads were then combined into a single artificial metagenomic file.
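The filter, downsample, and combine steps described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline and it does not reproduce how rasusa selects reads; it only shows the general idea on in-memory (name, sequence) pairs. The function names, the `label|name` tagging scheme, and the fixed random seed are assumptions made for illustration.

```python
import random

MIN_NANOPORE_LENGTH = 500  # bp threshold stated in the description


def filter_min_length(reads, min_length=MIN_NANOPORE_LENGTH):
    """Keep reads whose sequence is at least min_length bases long."""
    return [(name, seq) for name, seq in reads if len(seq) >= min_length]


def downsample_to_bases(reads, target_bases, seed=0):
    """Randomly pick reads until their total length reaches target_bases.

    Illustrative only: a shuffle-and-take approach, not rasusa's code.
    If fewer than target_bases are available, all reads are returned.
    """
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    kept, total = [], 0
    for name, seq in shuffled:
        if total >= target_bases:
            break
        kept.append((name, seq))
        total += len(seq)
    return kept


def combine_with_labels(components):
    """Merge components into one list, prefixing each read name with its
    source label (hypothetical scheme) so the true origin stays recoverable."""
    combined = []
    for label, reads in components.items():
        combined.extend((f"{label}|{name}", seq) for name, seq in reads)
    return combined
```

In a real workflow the reads would be streamed from FASTQ files and the targets would be 1 Gbp per human readset and 3 Gbp for the M. tuberculosis and Zymo readsets. Keeping the source label on each read is what makes the true taxon of every read known after the components are merged.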
Created:
2024-02-15



