[BIOCOM-PIPE] Artificial dataset from ZymoBIOMICS

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/3947842
Resource description:
This artificial dataset is a mock microbial community from ZymoBIOMICS, consisting of eight bacterial and two fungal strains (see the PDF for its exact composition). It includes three easy-to-lyse Gram-negative bacteria (e.g. Escherichia coli), five tough-to-lyse Gram-positive bacteria (e.g. Listeria monocytogenes), and two tough-to-lyse yeasts (e.g. Cryptococcus neoformans). The 16S/18S rRNA sequences and the genomes of these strains (both in FASTA format) are available at: https://s3.amazonaws.com/zymo-files/BioPool/ZymoBIOMICS.STD.refseq.v2.zip.

To characterize bacterial diversity, a 16S rRNA gene fragment targeting the V3-V4 regions was amplified with the primers F479 (5’-CAGCMGCYGCNGTAANAC-3’) and R888 (5’-CCGYCAATTCMTTTRAGT-3’), using the expertise and protocols of the GenoSol platform. The 16S PCR products were purified and quantified using the QuantiFluor staining kit (Promega, USA). A second PCR of 7 cycles was then run in duplicate for each sample under similar conditions, with the purified PCR products as template (7.5 ng of DNA in a 25 μl PCR mix) and dedicated fusion primers (‘F479/MID’, ‘R888/MID/’) carrying multiplex identifiers at their 5’ ends. All duplicated PCR products were pooled, purified and quantified using the QuantiFluor staining kit (Promega, USA). Samples were then pooled and cleaned with the Agencourt AMPure XP system (Beckman Coulter Genomics) to remove excess nucleotides, salts and enzymes.

The V3-V4 amplicons were sequenced using Illumina MiSeq technology (four and three replicates from two independent runs). Paired-end reads were then trimmed (Q30 at the 3' end) with PRINSeq and assembled with FLASH, requiring an overlap of 10 to 100 bases with at least 96% homology between reads.
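The overlap criteria quoted above (10 to 100 overlapping bases, at least 96% agreement) can be illustrated with a small Python sketch. This is not FLASH's actual algorithm, which also uses base qualities and its own scoring. The function names are hypothetical, and treating "homology" as the fraction of identical bases in the overlap is an assumption.

```python
# Illustrative only: a naive overlap search using the thresholds quoted in
# the description. Function names are hypothetical, not part of FLASH.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_overlap(fwd, rev_rc, min_overlap=10, max_overlap=100, min_identity=0.96):
    """Find the best suffix/prefix overlap between fwd and rev_rc.

    Returns (length, identity) or None. Candidates are ranked by identity,
    then by length. Identity is assumed to mean the fraction of matching bases.
    """
    best = None
    limit = min(max_overlap, len(fwd), len(rev_rc))
    for length in range(min_overlap, limit + 1):
        tail = fwd[-length:]      # end of the forward read
        head = rev_rc[:length]    # start of the reverse-complemented mate
        matches = sum(1 for a, b in zip(tail, head) if a == b)
        identity = matches / length
        if identity < min_identity:
            continue
        if best is None or (identity, length) > (best[1], best[0]):
            best = (length, identity)
    return best


def merge_pair(fwd, rev, **thresholds):
    """Merge a read pair into one sequence, or return None if no overlap passes."""
    rev_rc = reverse_complement(rev)
    hit = find_overlap(fwd, rev_rc, **thresholds)
    if hit is None:
        return None
    length, _ = hit
    return fwd + rev_rc[length:]
```

Calling `merge_pair(forward_read, reverse_read)` returns the merged fragment when the pair overlaps within the stated limits, and `None` otherwise. For real data, FLASH itself should be used.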
The seven samples come from two independent Illumina MiSeq runs (RUN01 and RUN02), one comprising four replicates and the other three, each replicate from an independent PCR reaction.
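The primers F479 and R888 given in the description contain IUPAC degenerate bases (M, Y, N, R), so each one stands for a set of concrete sequences. As a minimal sketch, assuming one wants to search reads or reference sequences for primer sites, the Python snippet below expands a degenerate primer into a regular expression. The helper names `primer_to_regex` and `degeneracy` are hypothetical and are not part of the dataset or the BIOCOM-PIPE pipeline.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as quoted in the dataset description (5' to 3').
F479 = "CAGCMGCYGCNGTAANAC"
R888 = "CCGYCAATTCMTTTRAGT"


def primer_to_regex(primer):
    """Translate a degenerate primer into a regular expression pattern."""
    parts = []
    for code in primer.upper():
        bases = IUPAC[code]
        # Unambiguous bases stay literal; ambiguous ones become a character class.
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def degeneracy(primer):
    """Count how many concrete sequences a degenerate primer represents."""
    total = 1
    for code in primer.upper():
        total *= len(IUPAC[code])
    return total


if __name__ == "__main__":
    for name, seq in (("F479", F479), ("R888", R888)):
        print(name, primer_to_regex(seq), degeneracy(seq))
```

Under these codes F479 expands to 64 concrete sequences and R888 to 8. Note that R888 is written 5' to 3' on the reverse strand, so matching it against a forward-strand sequence requires reverse-complementing one of the two first.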
Created:
2024-07-19