snvphyl_manuscript_synthetic_datasets.tar.gz
Source: DataCite Commons. Updated: 2020-09-03. Indexed: 2024-07-25.
Download link:
https://figshare.com/articles/dataset/snvphyl_manuscript_synthetic_datasets_tar_gz/4294838
Description:
This package contains synthetic datasets used to evaluate the SNVPhyl (http://snvphyl.readthedocs.io/) pipeline. It is divided into two separate datasets. Additional details on how these datasets were constructed are available at https://github.com/apetkau/snvphyl-validations.

1. e-coli-simulated-dataset: Simulated reads for evaluating SNVPhyl's SNV detection accuracy.

Reads are based on an E. coli reference genome (NC_002695) plus two plasmids (NC_002128, NC_002127), which were concatenated into a single FASTA file, reference-genome/e_coli_sakai_w_plasmids.fasta. Random mutations were introduced to produce the variant genomes under variant-genomes/. Reads were simulated using ART Illumina (http://www.niehs.nih.gov/research/resources/software/biostatistics/art/) to generate the fastq files in this directory.

2. salmonella-heidelberg-contamination: Simulated reads for evaluating SNVPhyl's performance in the presence of contamination from another genomic sample.

Reads for the sample SH13-001 (BioSample: SAMN04334637) were downsampled and contaminated with SH12-001 (BioSample: SAMN04334627) at percentages of 5%, 10%, 20%, and 30%.
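The contamination step can be illustrated with a minimal Python sketch. It is not the authors' procedure (which is documented in the snvphyl-validations repository); it assumes a fixed total read budget per mixed sample and treats reads as simple identifiers. The function names `contamination_counts` and `mix_reads`, the read budget, and the seed are all hypothetical.

```python
import random


def contamination_counts(total_reads, percent):
    """Split a fixed read budget between the primary sample and the contaminant.

    Assumption: the total depth stays constant and `percent` of it comes
    from the contaminating sample.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    n_contaminant = total_reads * percent // 100
    return total_reads - n_contaminant, n_contaminant


def mix_reads(primary, contaminant, total_reads, percent, seed=0):
    """Downsample both read sets without replacement and combine them."""
    rng = random.Random(seed)  # fixed seed so the mixture is reproducible
    n_primary, n_contaminant = contamination_counts(total_reads, percent)
    mixed = rng.sample(primary, n_primary) + rng.sample(contaminant, n_contaminant)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Placeholder read identifiers standing in for SH13-001 and SH12-001 reads.
    primary = [f"SH13-001_read{i}" for i in range(2000)]
    contaminant = [f"SH12-001_read{i}" for i in range(2000)]
    for percent in (5, 10, 20, 30):
        mixed = mix_reads(primary, contaminant, total_reads=1000, percent=percent)
        n_foreign = sum(r.startswith("SH12-001") for r in mixed)
        print(percent, len(mixed), n_foreign)
```

For paired-end data, the same sampled indices would have to be applied to both mate files so that pairs stay together; this sketch does not handle that.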
Provider:
figshare
Created:
2016-12-08



