snvphyl_manuscript_synthetic_datasets.tar.gz
Source: NIAID Data Ecosystem (indexed 2026-03-09)
Download link:
https://figshare.com/articles/dataset/snvphyl_manuscript_synthetic_datasets_tar_gz/4294838
Description:
This package contains synthetic datasets used to evaluate the SNVPhyl (http://snvphyl.readthedocs.io/) pipeline. It is divided into two separate datasets. Additional details on how these datasets were constructed are available at https://github.com/apetkau/snvphyl-validations.
1. e-coli-simulated-dataset: Simulated reads for evaluating SNVPhyl's SNV detection accuracy.
Reads are based on an E. coli reference genome (NC_002695) plus two plasmids (NC_002128, NC_002127), which were concatenated into a single FASTA file, reference-genome/e_coli_sakai_w_plasmids.fasta. Random mutations were introduced to produce the variant genomes under variant-genomes/.
Reads were simulated using ART Illumina (http://www.niehs.nih.gov/research/resources/software/biostatistics/art/) to generate the fastq files in this directory.
2. salmonella-heidelberg-contamination: Simulated reads for evaluating SNVPhyl's performance in the presence of contamination from another genomic sample.
Reads for the sample SH13-001 (BioSample: SAMN04334637) were downsampled and contaminated with SH12-001 (BioSample: SAMN04334627) at percentages of 5%, 10%, 20%, and 30%.
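The contamination levels in the second dataset can be illustrated with a small sketch. This is not the procedure used to build the dataset (that is documented in the snvphyl-validations repository linked above). It assumes the percentages are fractions of the total read count, and the function names, the fixed total of 1000 reads, and the seed are hypothetical choices made only for illustration.

```python
import random


def mix_counts(total_reads, contamination_fraction):
    """Split a target read count into (primary, contaminant) counts.

    Assumption: the contamination percentage is a fraction of total reads.
    """
    n_contaminant = round(total_reads * contamination_fraction)
    return total_reads - n_contaminant, n_contaminant


def contaminate(primary_reads, contaminant_reads, total_reads,
                contamination_fraction, seed=0):
    """Downsample both read sets and merge them into one shuffled mixture."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_primary, n_contaminant = mix_counts(total_reads, contamination_fraction)
    mixed = (rng.sample(primary_reads, n_primary)
             + rng.sample(contaminant_reads, n_contaminant))
    rng.shuffle(mixed)
    return mixed


# Read counts per contamination level for a hypothetical 1000-read sample.
LEVELS = (0.05, 0.10, 0.20, 0.30)
COUNTS = {level: mix_counts(1000, level) for level in LEVELS}
```

Under these assumptions, a 20% mixture of 1000 reads would hold 800 reads from SH13-001 and 200 from SH12-001. Real data would be sampled as read pairs from fastq files rather than from in-memory lists.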
Created:
2016-12-09



