
snvphyl_manuscript_synthetic_datasets.tar.gz

Source: DataCite Commons. Updated 2020-09-03; indexed 2024-07-25.
Download link:
https://figshare.com/articles/dataset/snvphyl_manuscript_synthetic_datasets_tar_gz/4294838
Description:
This package contains synthetic datasets used to evaluate the SNVPhyl (http://snvphyl.readthedocs.io/) pipeline. It is divided into two separate datasets. Additional details on how these datasets were constructed are available at https://github.com/apetkau/snvphyl-validations.

1. e-coli-simulated-dataset: Simulated reads for evaluating SNVPhyl's SNV detection accuracy.

Reads are based on an E. coli reference genome (NC_002695) plus two plasmids (NC_002128, NC_002127), which were concatenated into a single fasta file, reference-genome/e_coli_sakai_w_plasmids.fasta. Random mutations were introduced to produce the variant genomes under variant-genomes/. Reads were simulated using ART Illumina (http://www.niehs.nih.gov/research/resources/software/biostatistics/art/) to generate the fastq files in this directory.

2. salmonella-heidelberg-contamination: Simulated reads for evaluating SNVPhyl's performance in the presence of contamination from another genomic sample.

Reads for the sample SH13-001 (BioSample: SAMN04334637) were downsampled and contaminated with reads from SH12-001 (BioSample: SAMN04334627) at percentages of 5%, 10%, 20%, and 30%.
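
The contamination step can be pictured with a small sketch. This is not the procedure the dataset authors used (that is documented in the snvphyl-validations repository); it only illustrates one plausible reading, in which the percentage is the fraction of reads in the final sample that come from the contaminant. The function name `mix_reads`, the read labels, and the read counts are all hypothetical.

```python
import random


def mix_reads(primary, contaminant, total, contamination_fraction, seed=0):
    """Downsample two read sets into one mixed sample of `total` reads.

    Assumption: `contamination_fraction` is the share of reads in the
    output that are drawn from `contaminant`; the rest come from `primary`.
    """
    if not 0.0 <= contamination_fraction <= 1.0:
        raise ValueError("contamination_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_contaminant = round(total * contamination_fraction)
    n_primary = total - n_contaminant
    if n_primary > len(primary) or n_contaminant > len(contaminant):
        raise ValueError("not enough reads to sample from")
    # Sample without replacement from each source, then shuffle together.
    mixed = rng.sample(primary, n_primary) + rng.sample(contaminant, n_contaminant)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Placeholder read identifiers standing in for real fastq records.
    primary = [f"SH13-001_read{i}" for i in range(1000)]
    contaminant = [f"SH12-001_read{i}" for i in range(1000)]
    for pct in (5, 10, 20, 30):
        mixed = mix_reads(primary, contaminant, total=200,
                          contamination_fraction=pct / 100)
        n_contam = sum(r.startswith("SH12-001") for r in mixed)
        print(f"{pct}% target: {n_contam} of {len(mixed)} reads from SH12-001")
```

With real paired-end fastq files, both mates of a pair would need to be sampled together, which this sketch does not attempt.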
Provider:
figshare
Created:
2016-12-08