ChloroScan Supplementary files
Source: DataCite Commons. Updated 2025-08-24; indexed 2026-05-07.
Download link:
https://figshare.unimelb.edu.au/articles/dataset/ChloroScan_Supplementary_files/28722788/5
Description:
<b>Supplementary files:</b> ChloroScan outputs for 4 real metagenomes and 3 benchmarking metagenomes; the marker gene database and the files required to create it; OrthoFinder outputs showing the orthogroup assignments, together with the lists of all genomes selected.

<b>Description of the workflow of marker gene database construction.</b> 458 genomes were selected from NCBI GenBank to assess the phylogeny of algal plastid genomes (the GenBank IDs can be found in marker_gene_database_files.tar.gz). Genes are first clustered into orthogroups. We then select the orthogroups of 22 conserved genes (inferred from gene content comparisons) and remove in-paralogs that are identical to each other. The remaining sequences are aligned with MAFFT, the alignments are trimmed with ClipKIT, and the trimmed alignments are concatenated with PhyKIT. IQ-TREE 2 takes the concatenated supermatrix and infers a maximum likelihood tree. After generating the tree, we decorate each node with its marker genes and construct marker sets by grouping co-located markers, following CheckM's rationale. Finally, we pick the lineages with reasonable monophyly and save their marker gene sets to the database. When ChloroScan runs, the binning module uses the profile hidden Markov models (HMM files) and these selected marker sets to guide binning and to assess the quality of the final bins in terms of completeness and purity.
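The deduplication and concatenation steps described above can be illustrated with a minimal Python sketch. It is not the code used to build the database, which relies on PhyKIT for concatenation. The function names `dedupe_identical` and `concatenate`, the genome labels, and the choice to pad genomes missing from a gene with gap characters are all assumptions made for illustration.

```python
def dedupe_identical(records):
    """Drop exact duplicate sequences within one orthogroup.

    records: list of (genome_id, sequence) tuples (hypothetical layout).
    Only copies that are identical for the same genome are removed;
    in-paralogs with differing sequences are kept.
    """
    seen = set()
    kept = []
    for genome_id, seq in records:
        key = (genome_id, seq)
        if key not in seen:
            seen.add(key)
            kept.append((genome_id, seq))
    return kept


def concatenate(alignments, missing="-"):
    """Join per-gene alignments into one supermatrix.

    alignments: list of dicts mapping genome_id -> aligned sequence.
    All sequences inside one dict must have the same length.
    Genomes absent from a gene are padded with `missing` (an assumption).
    """
    taxa = sorted({g for aln in alignments for g in aln})
    matrix = {g: [] for g in taxa}
    for aln in alignments:
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in an alignment must share one length")
        width = lengths.pop()
        for g in taxa:
            matrix[g].append(aln.get(g, missing * width))
    return {g: "".join(parts) for g, parts in matrix.items()}


# Toy data: genome "g1" carries two identical copies of the first gene.
gene_a = dedupe_identical([("g1", "MTA"), ("g1", "MTA"), ("g2", "MTS")])
gene_b = {"g1": "AK-L"}  # "g2" lacks this gene
supermatrix = concatenate([dict(gene_a), gene_b])
```

In this toy case `gene_a` keeps one copy per genome, and `supermatrix` holds one row per genome with the second gene of `g2` filled by gaps. A matrix of this shape is what a tree inference tool such as IQ-TREE 2 would take as input.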
Provider:
The University of Melbourne
Created:
2025-08-22



