Datasets for Lupo et al. (2022) Origin and Evolution of Pseudomurein Biosynthetic Gene Clusters
Source: Figshare. Updated: 2024-11-02. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Datasets_for_Lupo_et_al_2022_Origin_and_Evolution_of_Pseudomurein_Biosynthetic_Gene_Clusters/21641612/2
Resource description:
<pre><code>Lupo et al. 2022 Pseudomurein: Archive content for v2 (Nov 2024)</code></pre>
<b>Overview</b>
<pre><code>... 59 directories, 777 files</code></pre>
<code>README.md</code>: this file.
<code>command-line.sh</code>: examples of bash commands to use or generate the files stored in this archive.

<b>10-archaeal-proteomes</b>
This directory contains an archive (<code>.tar.gz</code>) of FASTA (<code>.faa</code>) files for the 10 organisms used by OrthoFinder.

<b>archaeal-genomes</b>
This directory contains an archive (<code>.tar.gz</code>) of 819 FASTA (<code>.faa</code>) files corresponding to the archaeal database.

<b>bacterial-genomes</b>
This directory contains an archive (<code>.tar.gz</code>) of 598 FASTA (<code>.faa</code>) files corresponding to the bacterial database.

<b>config</b>
This directory contains two configuration files (<code>.yaml</code>) used by the <code>classify-ali.pl</code> Perl script from the Bio::MUST modules:
<code>classifier.yaml</code> is the configuration file used to filter the 6,321 orthologous groups (OGs).
<code>five_org_classifier.yaml</code> is the configuration file used to filter the OGs after the round of Forty-Two TBLASTN.

<b>BLASTP</b>
List of configuration files to run Forty-Two BLASTP (FILTERING OF CANDIDATE PROTEINS; see Figure S1 in the main manuscript).

<b>TBLASTN</b>
List of configuration files to run Forty-Two TBLASTN (FILTERING OF CANDIDATE PROTEINS; see Figure S1 in the main manuscript).

<b>NCBI_CCD</b>
List of HMM (<code>.hmm</code>) profile files downloaded from the NCBI CDD database and the corresponding HMM search (<code>.hmms</code>) files.

<b>ompapa-results</b>
Raw Ompa-Pa results associated with the NCBI CDD HMM profiles.

<b>ompapa</b>
This directory contains a list of the OGs that passed the taxonomic filter (see the <code>config</code> section).

<b>alignments</b>
List of alignments in FASTA format of the OGs included in the <code>retained_OGs.lis</code> file (see the <code>ompapa</code> section).

<b>bacterial_db</b>
Results of ompapa analyses against the bacterial database. This directory contains two sub-directories:
<code>hmms</code>, which contains a list of HMM search (<code>.hmms</code>) files.
<code>ompapa-results</code>, which contains raw ompapa results.

<b>hmm_profiles</b>
List of HMM (<code>.hmm</code>) profiles of the OGs included in the <code>retained_OGs.lis</code> file (see the <code>ompapa</code> section).

<b>prokaryotic_db</b>
Results of ompapa analyses against the prokaryotic database. This directory contains two sub-directories:
<code>hmms</code>, which contains a list of HMM search (<code>.hmms</code>) files.
<code>ompapa-results</code>, which contains raw ompapa results.

<b>orthologous-groups</b>
This directory contains an archive (<code>.tar.gz</code>) of 6,321 FASTA (<code>.fasta</code>) files corresponding to the OGs generated by OrthoFinder.

<b>predictions</b>
This directory contains three sub-directories:
<code>interproscan</code>, which contains raw InterProScan results.
<code>signalp5</code>, which contains raw SignalP5 results.
<code>tmhmm</code>, which contains raw TMHMM results.
For the consolidated results, see Table S1 in the main manuscript.

<b>prokaryotic-genomes</b>
This directory contains a list of assembly accessions of the 80,490 organisms from the prokaryotic database.

<b>regulon</b>
This directory contains all the data needed to reproduce the 'regulon pipeline' (see the supplementary data of the main manuscript). All the command lines are included in the <code>command_line_regulon.sh</code> file.

<b>scripts</b>
This directory contains various Perl scripts used to generate some of the files stored in this archive (see the <code>command-line.sh</code> file).

<b>taxdump</b>
Mirror of the NCBI Taxonomy used in this study (downloaded on 4 May 2020).

<b>trees</b>
This directory contains three sub-directories:
<code>atp-grasp</code>
<code>mray-family</code>
<code>mur-family</code>
All the sub-directories contain the tree (<code>.tre</code>) files presented in the main manuscript and the corresponding raw IQ-TREE (<code>.ckp.gz</code>) results. They also contain various files needed to reproduce the phylogenetic analyses (see the <code>command-line.sh</code> file).<br>
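The description above gives file counts for several archives (819 archaeal `.faa` files, 598 bacterial `.faa` files, 6,321 OG `.fasta` files). As a minimal sketch of how a download could be sanity-checked against those counts, the Python snippet below counts the members of a `.tar.gz` archive by file extension. The function names and the archive path in the usage note are hypothetical; the README does not name the archive files, and this is not the authors' own procedure (their commands are in `command-line.sh`).

```python
import tarfile


def count_members(archive_path, suffix):
    """Count regular files in a .tar.gz archive whose names end with `suffix`."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sum(
            1
            for member in archive
            if member.isfile() and member.name.endswith(suffix)
        )


def check_archive(archive_path, suffix, expected):
    """Return (found, ok), where ok is True if the count matches `expected`."""
    found = count_members(archive_path, suffix)
    return found, found == expected
```

For example, `check_archive("archaeal-genomes/archive.tar.gz", ".faa", 819)` would compare the archaeal archive with the count stated above; the path here is a placeholder and must be replaced with the actual file name found in the download.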
Provided by:
Kerff, Frédéric; BAURAIN, Denis; Lupo, Valérian
Created:
2024-11-02



