
snvphyl_manuscript_synthetic_datasets.tar.gz

Source: NIAID Data Ecosystem (indexed 2026-03-09)
Download link:
https://figshare.com/articles/dataset/snvphyl_manuscript_synthetic_datasets_tar_gz/4294838
Description:
This package contains synthetic datasets used to evaluate the SNVPhyl (http://snvphyl.readthedocs.io/) pipeline. It is divided into two separate datasets. Additional details on how these datasets were constructed are available at https://github.com/apetkau/snvphyl-validations.

1. e-coli-simulated-dataset: Simulated reads for evaluating SNVPhyl's SNV detection accuracy. Reads are based on an E. coli reference genome (NC_002695) plus two plasmids (NC_002128, NC_002127), which were concatenated into a single FASTA file, reference-genome/e_coli_sakai_w_plasmids.fasta. Random mutations were introduced to produce the variant genomes under variant-genomes/. Reads were simulated using ART Illumina (http://www.niehs.nih.gov/research/resources/software/biostatistics/art/) to generate the fastq files in this directory.

2. salmonella-heidelberg-contamination: Simulated reads for evaluating SNVPhyl's performance in the presence of contamination from another genomic sample. Reads for the sample SH13-001 (BioSample: SAMN04334637) were downsampled and contaminated with reads from SH12-001 (BioSample: SAMN04334627) at percentages of 5%, 10%, 20%, and 30%.
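The contamination step described for the second dataset can be illustrated with a minimal sketch. This is not the authors' procedure (the snvphyl-validations repository documents that); it assumes, purely for illustration, that the contamination percentage is a fraction of the total read count and that reads are drawn uniformly at random. The function name `mix_reads`, its parameters, and the toy read identifiers are all hypothetical.

```python
import random


def mix_reads(primary, contaminant, total, fraction, seed=0):
    """Build a mixed read set of size `total`.

    Assumption (not from the dataset description): `fraction` is the share
    of the final read count that comes from the contaminating sample.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")

    n_contam = round(total * fraction)   # reads taken from the contaminant
    n_primary = total - n_contam         # reads kept from the primary sample
    if n_primary > len(primary) or n_contam > len(contaminant):
        raise ValueError("not enough reads to sample from")

    rng = random.Random(seed)            # fixed seed for reproducibility
    mixed = rng.sample(primary, n_primary) + rng.sample(contaminant, n_contam)
    rng.shuffle(mixed)                   # avoid grouping reads by origin
    return mixed


# Toy stand-ins for read identifiers; real input would be FASTQ records.
primary_reads = [f"SH13-001_read{i}" for i in range(2000)]
contaminant_reads = [f"SH12-001_read{i}" for i in range(2000)]

# One mixed set per contamination level listed in the description.
mixtures = {
    pct: mix_reads(primary_reads, contaminant_reads, total=1000, fraction=pct / 100)
    for pct in (5, 10, 20, 30)
}
```

For paired-end data, the same sampled indices would need to be applied to both mate files so that pairs stay together; the sketch above ignores pairing.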
Created:
2016-12-09