AI-Guided Plasticizer Biodegradation Genomics Dataset of Rhizobium pusense KSSKSLAB04
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/AI-Guided_Plasticizer_Biodegradation_Genomics_Dataset_of_Rhizobium_pusense_KSSKSLAB04/31342030
Description:
This repository contains the complete analysis dataset associated with the hybrid whole-genome sequencing and functional characterization of Rhizobium pusense strain KSSKSLAB04, a plasticizer-degrading bacterium.
The genome was generated with a hybrid sequencing strategy that combines Oxford Nanopore long reads with PacBio Onso short reads, yielding a high-quality ~5.33 Mb assembly with 5,043 protein-coding genes (CDS), 12 rRNA genes, 57 tRNA genes, and 1 tmRNA (5,113 annotated genes in total).
This Figshare repository includes:
• Final polished genome assembly (FASTA format)
• Genome annotation files (GFF3, CDS nucleotide sequences, protein FASTA)
• Assembly quality statistics (QUAST, BUSCO, CheckM2 summaries)
• KEGG and KofamScan pathway reconstruction outputs
• eggNOG functional annotation results
• Pan-genome analysis outputs and gene presence/absence matrices
• Mobile genetic element analysis (ISEScan, genomic islands, prophage predictions)
• antiSMASH secondary metabolite biosynthetic gene cluster predictions
• FastANI taxonomic validation results
• Variant calling outputs (VCF and annotated SNP tables)
• Oxygenase and esterase identification datasets
• Machine-learning datasets and analysis outputs for plasticizer hydrolase prediction
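As a quick sanity check on the annotation files listed above, the feature types in the GFF3 file can be tallied and compared with the gene counts quoted in this description. The following is a minimal sketch, not part of the deposited pipeline: the function names are hypothetical, and it assumes the annotation uses the feature-type labels `CDS`, `rRNA`, `tRNA`, and `tmRNA` in column 3, which may differ depending on the annotation tool used.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Counts quoted in the dataset description (not read from the files).
EXPECTED_COUNTS = {"CDS": 5043, "rRNA": 12, "tRNA": 57, "tmRNA": 1}


def count_feature_types(gff_lines: Iterable[str]) -> Counter:
    """Tally column 3 (feature type) of a GFF3 file, line by line."""
    counts: Counter = Counter()
    for raw in gff_lines:
        line = raw.rstrip("\n")
        # An embedded ##FASTA section marks the end of the feature table.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A valid GFF3 feature line has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


def compare_to_description(counts: Counter) -> Dict[str, Tuple[int, int]]:
    """Return {feature_type: (observed, expected)} for the quoted counts."""
    return {
        ftype: (counts.get(ftype, 0), expected)
        for ftype, expected in EXPECTED_COUNTS.items()
    }
```

To use it, open the downloaded GFF3 file and pass the file handle to `count_feature_types`, then inspect the result of `compare_to_description`. Small differences from the quoted numbers would not necessarily indicate a problem, since some annotators write one feature across several lines or add extra feature types such as `gene`.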
Raw sequencing reads are available in NCBI SRA under the associated BioProject. The assembled genome is deposited in NCBI GenBank (Accession: JBUXKN000000000). All custom scripts and bioinformatics pipelines are available on GitHub.
This dataset is provided to ensure full reproducibility, transparency, and reusability of the genomic, comparative, and machine-learning analyses described in the associated manuscript.
Created:
2026-02-21



