
Reconstruction of 1,979 prokaryotic metagenome-assembled genomes from 37 global cave environments

Source: Figshare. Updated: 2025-07-13. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Cave_Metagenome-assembled_Genome/29554673
Description:
This record contains a non-redundant catalog of cave metagenome-assembled genomes (MAGs) reconstructed from 37 caves, together with all companion tables needed for reuse. Representative genomes are provided in two archives split by MIMAG quality; the full set of refined (pre-dereplication) MAGs is provided separately. Summary tables (Excel) cover sample metadata, genome quality, taxonomy, biosynthetic gene clusters (BGCs), antimicrobial-resistance genes (ARGs), functional annotations, and read recruitment. Two HTML reports document read quality before and after trimming.

Root level

metadata.xlsx: Sample metadata and sequencing statistics. Each row is a metagenome sample with accessions and context.
MAGs_HQ.tar and MAGs_MQ.tar: 1,979 species-level representative MAGs, split by quality.
magscot.tar: 4,234 refined MAGs retained from 25,276 candidate bins after bin refinement across three binners.

HQ (high quality): ≥90% completeness and ≤5% contamination.
MQ (medium quality): ≥50% completeness and ≤10% contamination.

All MAG FASTA files are named [RUN_ACCESSION]_cleanbin_[BIN_ID].fa. MAGs present in MAGs_HQ.tar or MAGs_MQ.tar are species-level representatives; MAGs found only in magscot.tar are non-representative refined MAGs.

Unpacking archives

.tar.zst: Decompress and then untar.
Linux/macOS: tar -I unzstd -xf (or zstd -d && tar -xf), giving the archive file name as the argument to each command.
Windows: open with 7-Zip (extract once to get the .tar, then extract the .tar to get the final folder).

Folders

QCreport/
Two interactive MultiQC HTML reports summarizing read quality:
Report_BeforeQC.html: before trimming
Report_AfterQC.html: after trimming
Open directly in a browser to view adapter content, base-quality profiles, etc.

Analysis/
The Excel workbooks provide all annotations needed to reuse the catalog. Every table is keyed by MAG_ID, which matches the FASTA filename stem in the archives ([RUN_ACCESSION]_cleanbin_[BIN_ID].fa; e.g., ERR10479404_cleanbin_000031), so columns using MAG_ID map directly to the corresponding genome files.
CheckM2.xlsx: Genome quality for all 4,234 draft MAGs (located in magscot.tar)
Classification.xlsx: GTDB-Tk taxonomic classification for the representative MAGs
BGC.xlsx: antiSMASH 8.0 biosynthetic gene cluster output for the representative MAGs
ARG.xlsx: DRAMMA ARG predictions for the representative MAGs
DRAM.xlsx: DRAM functional annotation for the representative MAGs
CoverM.xlsx: Read recruitment of each sample against each representative MAG
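
The naming convention and quality thresholds described above can be illustrated with a short Python sketch. It is not part of the record. The function names, the MagName type, the MAGs_HQ/ folder in the example path, and the completeness and contamination values in the usage lines are assumptions made for illustration. The thresholds are the ones stated in the description, and returning None for a genome below the MQ thresholds is a choice of this sketch.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional


class MagName(NamedTuple):
    """Parts of a MAG FASTA file name (hypothetical helper type)."""
    mag_id: str         # file name stem, the MAG_ID key used in the tables
    run_accession: str  # e.g. "ERR10479404"
    bin_id: str         # e.g. "000031"


def parse_mag_filename(path: str) -> MagName:
    """Split '[RUN_ACCESSION]_cleanbin_[BIN_ID].fa' into its parts."""
    name = PurePosixPath(path).name
    # Drop the '.fa' extension if present so the stem equals MAG_ID.
    stem = name[:-3] if name.endswith(".fa") else name
    run_accession, sep, bin_id = stem.partition("_cleanbin_")
    if not sep or not run_accession or not bin_id:
        raise ValueError(f"unexpected MAG file name: {path!r}")
    return MagName(stem, run_accession, bin_id)


def quality_tier(completeness: float, contamination: float) -> Optional[str]:
    """Return 'HQ', 'MQ', or None using the thresholds stated in the record.

    Both arguments are percentages (0 to 100).
    """
    if completeness >= 90 and contamination <= 5:
        return "HQ"
    if completeness >= 50 and contamination <= 10:
        return "MQ"
    return None  # below the medium-quality thresholds


if __name__ == "__main__":
    # The folder name and the numeric values below are illustrative only.
    mag = parse_mag_filename("MAGs_HQ/ERR10479404_cleanbin_000031.fa")
    print(mag.mag_id, mag.run_accession, mag.bin_id)
    print(quality_tier(95.2, 1.3), quality_tier(72.0, 8.0), quality_tier(40.0, 2.0))
```

The mag_id returned by parse_mag_filename is the value to look up in the MAG_ID column of the Analysis/ workbooks.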
Created:
2025-07-13