Data accompanying the manuscript: Capsular specificity in temperate phages of Klebsiella pneumoniae is driven by diverse receptor-binding enzymes
Source: Figshare. Updated: 2026-03-12. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Data_accompanying_the_manuscript_i_Capsular_specificity_in_temperate_phages_of_i_i_Klebsiella_pneumoniae_i_i_i_i_is_driven_by_diverse_receptor-binding_enzymes_i_/29181188
Description:
Accompanying study

This dataset accompanies the manuscript "Capsular specificity in temperate phages of Klebsiella pneumoniae is driven by diverse receptor-binding enzymes", and complements the Supplementary Data published alongside the manuscript. The relevant code can be found under the following DOI: https://doi.org/10.5281/zenodo.18682705

Data description

Input

The first part contains all input data for the project: annotated genome assemblies and associated metadata for the bacterial isolates analyzed in the study, as well as the annotated prophage sequences identified from those isolates.

The corresponding compressed files included here, as obtained and generated using the workflow described in the manuscript, are:

- INPUT_DATA/BACTERIA_FASTA.tar.xz: Three sets of FASTA files (3,911 files per set) corresponding to the bacterial isolates:
  - Whole genome nucleotide sequences
  - Coding DNA sequences (CDS)
  - Predicted protein sequences (AA)
- INPUT_DATA/BACTERIA_GENBANK.tar.xz: 3,911 corresponding GenBank files with annotation
- INPUT_DATA/BACTERIA_METADATA.tar.xz:
  - contigs.tsv: List and lengths of all contigs per isolate
  - bacteria_iqtree.nwk: Phylogenetic tree of the bacterial isolates in Newick format (computed with IQ-TREE)
  - bacteria_metadata.tsv: Summary metadata table for the isolates
- INPUT_DATA/PROPHAGES_FASTA.tar.xz: Three sets of FASTA files (8,105 files per set) corresponding to the prophages detected in the bacterial isolates:
  - Whole genome nucleotide sequences
  - Coding DNA sequences (CDS)
  - Predicted protein sequences (AA)
- INPUT_DATA/PROPHAGES_GENBANK.tar.xz: 8,105 corresponding GenBank files with annotation
- INPUT_DATA/PROPHAGES_METADATA.tar.xz:
  - pcs2proteins.tsv: List of all phage proteins grouped into Protein Clusters, with the representative protein of each cluster
  - raw_hhsuite.tsv: Functional annotation table for the protein clusters, generated with HH-suite
  - prophages_metadata.tsv: Summary metadata table for the 8,105 prophages

Output

The second part contains key output data from the data processing pipeline that was used to generate all of the computational results in the paper, including extra analyses carried out during peer review.

The corresponding files are two compressed TAR archives:

- OUTPUT_DATA/GWAS.tar.xz: Full output folder, with three subfolders:
  - 1_INTERMEDIATE: functional annotations of prophage proteins, GWAS inputs
  - 3_PROCESSING: GWAS results obtained at different clustering levels
  - 4_ANALYZE: aggregated data for the 35 K-loci used in the study and individual results (sequences) for each K-locus
- OUTPUT_DATA/REVIEW.tar.xz: Full output folder, with two subfolders:
  - DEGRADED_CRYPTIC_PROPHAGES: Output of GWAS analyses based on a subset of high-quality prophages only (with >=99% completeness).
  - STS_GEODISTRIBUTION: Datasets underlying the comparative analysis of STs from the GWAS, KlebNNSsero and Pathogenwatch datasets.

Other

The last part contains other metadata needed to reproduce the results that is not included in the supporting information alongside the manuscript:

- OTHER/ecod.develop263.F70.domains.txt: Metadata of ECODs used in the paper, from release 20200207 (develop263), accessible at http://prodata.swmed.edu/ecod but published here for convenience.
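
The metadata tables listed above are distributed inside .tar.xz archives, so a reader may want to inspect one without unpacking everything to disk. The following is a minimal sketch using only the Python standard library. The helper name read_tsv_from_archive is hypothetical, and the code assumes, without confirmation from the dataset, that the tables are tab-separated, UTF-8 encoded, and start with a header row. Because the internal folder layout of each archive is not documented here, the member is matched by the end of its path rather than by a full path.

```python
import csv
import io
import tarfile


def read_tsv_from_archive(archive_path, member_name):
    """Return the rows of the first archive member whose path ends with member_name.

    Assumptions (not confirmed by the dataset description): the member is a
    tab-separated, UTF-8 text file whose first line is a header.
    """
    with tarfile.open(archive_path, mode="r:xz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(member_name):
                raw = archive.extractfile(member)
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Materialize the rows before the archive is closed.
                return list(csv.DictReader(text, delimiter="\t"))
    raise FileNotFoundError(f"{member_name} not found in {archive_path}")


if __name__ == "__main__":
    # Path as named in the description; adjust to wherever the file was downloaded.
    rows = read_tsv_from_archive(
        "INPUT_DATA/PROPHAGES_METADATA.tar.xz", "prophages_metadata.tsv"
    )
    print(len(rows), "rows; columns:", list(rows[0].keys()) if rows else [])
```

Each row comes back as a dictionary keyed by whatever column names the header contains, so no column names need to be known in advance. For the larger FASTA and GenBank archives, streaming members one at a time in the same way avoids extracting thousands of files at once.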
Created:
2026-03-12



