Innate antiviral systems are major defensome components that influence prophage distribution in Acinetobacter baumannii
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14366155
Description:
In this project, we analysed the defensome of Acinetobacter baumannii to profile the defense systems associated with particular prophage profiles, and to predict which systems are most effective and against which specific phages. To do so, we used machine learning techniques to associate prophages with defense systems, both positively and negatively.
DOI (bioRxiv): https://doi.org/10.1101/2024.10.26.620419
Python scripts
Package versions: numpy 1.26.4, pandas 2.2.2
binary_matrix.py
Generate a binary matrix of defense systems from genomes without prophages, used as input for the UpSet plot.
coocurr_matrix.py
Generate a co-occurrence matrix of defense systems, used as input for Fig. 2A.
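One common way to obtain such a co-occurrence matrix is to multiply a presence-absence matrix by its transpose. The sketch below illustrates that idea only; the toy genomes, the system names and the decision to zero the diagonal are assumptions, not taken from coocurr_matrix.py.

```python
import numpy as np
import pandas as pd

# Hypothetical presence-absence matrix: rows are genomes, columns are systems.
pa = pd.DataFrame(
    [[1, 1, 0], [1, 0, 0], [1, 1, 1]],
    index=["G1", "G2", "G3"],
    columns=["RM", "CRISPR-Cas", "Gabija"],
)

# X^T X counts, for every pair of systems, the genomes that carry both.
cooc = pa.T.dot(pa)

# The diagonal holds each system's own genome count. Zeroing it keeps only
# pairwise co-occurrences (an illustrative choice, not the script's).
values = cooc.to_numpy().copy()
np.fill_diagonal(values, 0)
cooc_pairs = pd.DataFrame(values, index=cooc.index, columns=cooc.columns)

print(cooc_pairs)
```

The result is symmetric, so either triangle is enough for plotting.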
defsys_pres_ann.py
Create a presence-absence matrix of defense systems.
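As a minimal sketch of this step, a presence-absence matrix can be derived from a long-format annotation table with pandas. The column names `genome` and `system` and the toy records are hypothetical; the actual input format of defsys_pres_ann.py is described in README.txt.

```python
import pandas as pd

# Hypothetical long-format annotations: one row per defense system hit.
hits = pd.DataFrame(
    {
        "genome": ["G1", "G1", "G2", "G3", "G3", "G3"],
        "system": ["RM", "CRISPR-Cas", "RM", "RM", "Gabija", "RM"],
    }
)

# Count hits per genome/system pair, then collapse the counts to 0/1.
counts = pd.crosstab(hits["genome"], hits["system"])
presence_absence = (counts > 0).astype(int)

print(presence_absence)
```

Collapsing to 0/1 means a genome with two copies of the same system (G3 and RM here) is still scored as a single presence.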
freq_phages_bymlst.py
Get the most frequent prophages (10% of genomes) for each MLST in a provided list.
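The filtering could look roughly like the sketch below. It assumes that "10% of genomes" means a prophage is present in at least 10% of the genomes of a given MLST group; that reading, the variable names and the toy data are assumptions rather than the behaviour of freq_phages_bymlst.py.

```python
import pandas as pd

# Hypothetical inputs: prophage presence-absence per genome plus MLST labels.
pa = pd.DataFrame(
    {"phageA": [1, 1, 0, 0], "phageB": [0, 0, 0, 1], "phageC": [1, 0, 1, 1]},
    index=["G1", "G2", "G3", "G4"],
)
mlst = pd.Series(["ST2", "ST2", "ST1", "ST1"], index=pa.index, name="mlst")
mlst_of_interest = ["ST2", "ST1"]  # the provided list of MLST groups
threshold = 0.10  # assumed: present in at least 10% of the group's genomes

# The mean of a 0/1 column within a group is the fraction of genomes carrying it.
fraction = pa.groupby(mlst).mean()

frequent = {}
for st in mlst_of_interest:
    row = fraction.loc[st]
    frequent[st] = row[row >= threshold].index.tolist()

print(frequent)
```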
matrix_mlst_phages_freq.py
Generate two matrices, of absolute and relative frequency respectively, of prophages per frequent MLST group.
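To make the two matrices concrete, here is a small sketch, assuming that relative frequency is the absolute count divided by the number of genomes in each MLST group. The toy data and variable names are hypothetical and do not come from matrix_mlst_phages_freq.py.

```python
import pandas as pd

# Hypothetical prophage presence-absence matrix and MLST assignment per genome.
pa = pd.DataFrame(
    {"phageA": [1, 1, 0, 0], "phageB": [0, 0, 0, 1], "phageC": [1, 0, 1, 1]},
    index=["G1", "G2", "G3", "G4"],
)
mlst = pd.Series(["ST2", "ST2", "ST1", "ST1"], index=pa.index, name="mlst")

# Absolute frequency: number of genomes per MLST group carrying each prophage.
absolute = pa.groupby(mlst).sum()

# Relative frequency: the same counts divided by the size of each group.
group_sizes = mlst.value_counts()
relative = absolute.div(group_sizes, axis=0)

print(absolute)
print(relative)
```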
pres_aus_matrix_cl.py
Create a presence-absence matrix of prophages.
matrix_preaus_ml.py
Add two columns to the prophage presence-absence matrix: one with the defense systems of each genome and another with the MLST group to which the genome belongs.
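A sketch of how the two extra columns might be attached with pandas follows. The column names `defense_systems` and `mlst`, the ";" separator and the toy data are assumptions for illustration and may differ from what matrix_preaus_ml.py produces.

```python
import pandas as pd

genomes = ["G1", "G2", "G3"]

# Hypothetical prophage and defense system presence-absence matrices.
pa_phages = pd.DataFrame({"phageA": [1, 0, 1], "phageB": [0, 0, 1]}, index=genomes)
pa_defense = pd.DataFrame(
    {"RM": [1, 1, 1], "CRISPR-Cas": [1, 0, 0], "Gabija": [0, 0, 1]},
    index=genomes,
)
mlst = pd.Series({"G1": "ST2", "G2": "ST2", "G3": "ST1"})

# Collapse each genome's defense systems into one ";"-separated string.
systems = pa_defense.apply(lambda row: ";".join(row[row == 1].index), axis=1)

# assign() aligns both new columns on the genome index.
table = pa_phages.assign(defense_systems=systems, mlst=mlst)

print(table)
```

Keeping the defense systems in a single column is convenient for inspection; for model training they would usually be expanded back into one binary column per system.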
Phylogeny
Use assembly_seq.pl and uniq_sl.pl to build the initial multi-FASTA file containing only the core genes, which serves as input for MAFFT. The resulting MSA is processed with ClipKIT to remove gaps and keep the most informative regions. The processed MSA is then used as input for IQ-TREE to generate the tree.
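The three steps after the core-gene multi-FASTA is built can be chained from Python roughly as sketched below. The file names, the `iqtree2` executable name and the use of each tool's default settings (only `--auto` is passed to MAFFT) are assumptions; the parameters actually used are documented in README.txt. The sketch only runs the tools when `run()` is called and they are installed.

```python
import subprocess
from pathlib import Path


def build_commands(core_fasta, workdir="phylogeny"):
    """Return (command, stdout_file) pairs for the alignment, trimming and tree steps."""
    out = Path(workdir)
    msa = out / "core.aln.fasta"          # hypothetical file name
    trimmed = out / "core.trimmed.fasta"  # hypothetical file name
    return [
        # MAFFT writes the alignment to stdout, so it is redirected to a file.
        (["mafft", "--auto", str(core_fasta)], msa),
        # ClipKIT trims the alignment; its default trimming mode is assumed here.
        (["clipkit", str(msa), "-o", str(trimmed)], None),
        # IQ-TREE infers the tree from the trimmed alignment.
        (["iqtree2", "-s", str(trimmed)], None),
    ]


def run(core_fasta, workdir="phylogeny"):
    """Execute the steps in order, stopping at the first failure."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for cmd, stdout_file in build_commands(core_fasta, workdir):
        if stdout_file is None:
            subprocess.run(cmd, check=True)
        else:
            with open(stdout_file, "w") as handle:
                subprocess.run(cmd, check=True, stdout=handle)


commands = build_commands("core_genes.fasta")
for cmd, _ in commands:
    print(" ".join(cmd))
```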
Circos
Circos plots were generated from files produced by prepareForCircos2.pl. This script uses "defsys_presaus_ann.tsv", "logical_viruses.tsv" and a list of the genomes in each MLST group to create the input file for the figure. These files are also provided.
README.txt
A more detailed version of the protocol used to generate the results and figures in the paper.
Created:
2025-03-04



