Datasets for Lupo et al. (2022) Origin and Evolution of Pseudomurein Biosynthetic Gene Clusters
Source: DataCite Commons. Updated 2024-11-02; indexed 2024-07-29.
Download link:
https://figshare.com/articles/dataset/Datasets_for_Lupo_et_al_2022_Origin_and_Evolution_of_Pseudomurein_Biosynthetic_Gene_Clusters/21641612/1
Description:
<pre><code>Lupo et al. 2022 Pseudomurein: Archive content for v1</code></pre>
<strong>Overview</strong>
<pre><code>... 61 directories, 769 files</code></pre>
README.md: this file.
<code>command-line.sh</code>: examples of bash commands to use or generate the files stored in this archive.
10-archaeal-proteomes
This directory contains an archive (<code>.tar.gz</code>) consisting of FASTA (<code>.faa</code>) files of the 10 organisms used by OrthoFinder.
alphafold
List of ZIP (<code>.zip</code>) files containing the raw AlphaFold predictions and ConSurf results.
archaeal-genomes
This directory contains an archive (<code>.tar.gz</code>) consisting of 819 FASTA (<code>.faa</code>) files corresponding to the archaeal database.
bacterial-genomes
This directory contains an archive (<code>.tar.gz</code>) consisting of 598 FASTA (<code>.faa</code>) files corresponding to the bacterial database.
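The file counts stated above (10, 819 and 598 FASTA files) can be checked after download. The following is a minimal sketch, assuming each directory holds a single gzip-compressed tar archive; the function name and the `EXPECTED_COUNTS` table are illustrative and not part of the archive's own scripts.

```python
import tarfile
from pathlib import PurePosixPath


def count_fasta_members(archive_path, suffix=".faa"):
    """Return the number of regular files ending in `suffix` in a .tar.gz archive."""
    count = 0
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            name = PurePosixPath(member.name).name
            # Skip directories and macOS "._" resource-fork entries.
            if member.isfile() and name.endswith(suffix) and not name.startswith("._"):
                count += 1
    return count


# Counts stated in this README, keyed by directory name.
EXPECTED_COUNTS = {
    "10-archaeal-proteomes": 10,
    "archaeal-genomes": 819,
    "bacterial-genomes": 598,
}
```

Calling `count_fasta_members()` on the archive found in a given directory and comparing the result with the matching `EXPECTED_COUNTS` entry flags an incomplete download. The archive file names are not listed in this README, so the path has to be supplied by the user.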
config
This directory contains two configuration files (<code>.yaml</code>) used by the <code>classify-ali.pl</code> Perl script from the Bio::MUST modules:
<code>classifier.yaml</code> is the configuration file used to filter the 6,321 orthologous groups (OGs).
<code>five_org_classifier.yaml</code> is the configuration file used to filter the OGs after the round of Forty-Two TBLASTN.
BLASTP
List of configuration files to run Forty-Two BLASTP (FILTERING OF CANDIDATE PROTEINS; see Figure S1 in the main manuscript).
TBLASTN
List of configuration files to run Forty-Two TBLASTN (FILTERING OF CANDIDATE PROTEINS; see Figure S1 in the main manuscript).
NCBI_CCD
List of HMM (<code>.hmm</code>) profile files downloaded from the NCBI CDD and the corresponding HMM search (<code>.hmms</code>) files.
ompapa-results
Raw OmpaPa results associated with the NCBI CDD HMM profiles.
ompapa
This directory contains a list of the OGs that passed the taxonomic filter (see the <code>config</code> section).
alignments
List of alignments in FASTA format of the OGs included in the <code>retained_OGs.lis</code> file (see <code>ompapa</code> section).
bacterial_db
Results of OmpaPa analyses against the bacterial database. This directory contains two sub-directories:
<code>hmms</code>, which contains a list of HMM search (<code>.hmms</code>) files.
<code>ompapa-results</code>, which contains the raw OmpaPa results.
hmm_profiles
List of HMM (<code>.hmm</code>) profiles of the OGs included in the <code>retained_OGs.lis</code> file (see <code>ompapa</code> section).
prokaryotic_db
Results of OmpaPa analyses against the prokaryotic database. This directory contains two sub-directories:
<code>hmms</code>, which contains a list of HMM search (<code>.hmms</code>) files.
<code>ompapa-results</code>, which contains the raw OmpaPa results.
orthologous-groups
This directory contains an archive (<code>.tar.gz</code>) consisting of 6,321 FASTA (<code>.fasta</code>) files corresponding to the OGs generated by OrthoFinder.
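Once the OG archive is extracted, the size of each orthologous group can be tallied before any filtering. A minimal sketch, assuming plain-text FASTA files with one `>` header line per sequence; the function names are illustrative.

```python
from pathlib import Path


def count_sequences(fasta_path):
    """Count FASTA records by counting header lines (those starting with '>')."""
    with open(fasta_path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def og_sizes(directory, pattern="*.fasta"):
    """Map each OG file name in `directory` to its number of sequences."""
    return {
        path.name: count_sequences(path)
        for path in sorted(Path(directory).glob(pattern))
    }
```

If the extraction is complete, `len(og_sizes(extracted_dir))` should equal the 6,321 OGs stated above, where `extracted_dir` is wherever the archive was unpacked.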
predictions
This directory contains three sub-directories:
<code>interproscan</code>, which contains the raw InterProScan results.
<code>signalp5</code>, which contains the raw SignalP5 results.
<code>tmhmm</code>, which contains the raw TMHMM results.
For the consolidated results, see Table S1 in the main manuscript.
prokaryotic-genomes
This directory contains a list of assembly accessions of the 80,490 organisms from the prokaryotic database.
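Before fetching genomes from the accession list, it may be worth validating it. The sketch below assumes the list is a plain-text file with one accession per line, which this README does not confirm, and checks each entry against the standard NCBI assembly accession pattern (`GCA_` or `GCF_`, nine digits, a version suffix). The function name is illustrative.

```python
import re

# NCBI assembly accessions: GCA_ (GenBank) or GCF_ (RefSeq), 9 digits, ".version".
ACCESSION_RE = re.compile(r"^GC[AF]_\d{9}\.\d+$")


def load_accessions(path):
    """Return (valid, invalid) accession lists read from a one-per-line text file."""
    valid, invalid = [], []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            token = line.strip()
            if not token:
                continue  # ignore blank lines
            (valid if ACCESSION_RE.match(token) else invalid).append(token)
    return valid, invalid
```

With the full list, `len(valid)` would be expected to match the 80,490 organisms stated above, and a non-empty `invalid` list suggests the file uses a different layout than assumed here.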
regulon
This directory contains all the data needed to reproduce the ‘regulon pipeline’ (see supplementary data from the main manuscript). All the command lines are included in the <code>command_line_regulon.sh</code> file.
scripts
This directory contains various Perl scripts used to generate some of the files stored in this archive (see the <code>command-line.sh</code> file).
taxdump
Mirror of the NCBI Taxonomy used in this study (downloaded on 4 May 2020).
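A local taxonomy mirror makes it possible to resolve lineages without network access. The following is a minimal sketch, assuming the directory contains the standard NCBI `nodes.dmp` file (pipe-delimited fields, with taxon ID, parent taxon ID and rank in the first three columns); the function names are illustrative and independent of the Bio::MUST tooling used in the study.

```python
def parse_nodes(path):
    """Read nodes.dmp and return (parent, rank) dicts keyed by integer taxon ID."""
    parent, rank = {}, {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Fields are separated by "|" and padded with tabs.
            fields = [field.strip() for field in line.rstrip("\n").split("|")]
            if len(fields) < 3 or not fields[0]:
                continue
            tax_id = int(fields[0])
            parent[tax_id] = int(fields[1])
            rank[tax_id] = fields[2]
    return parent, rank


def lineage(tax_id, parent):
    """Return the taxon IDs from `tax_id` up to the root (the root is its own parent)."""
    path = [tax_id]
    while parent[tax_id] != tax_id:
        tax_id = parent[tax_id]
        path.append(tax_id)
    return path
```

For example, `lineage(2157, parent)` walks from Archaea (taxon ID 2157) up to the root node. Because the mirror is frozen at the download date, results may differ from those obtained with a current taxonomy dump.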
trees
This directory contains three sub-directories:
<code>atp-grasp</code>
<code>mray-family</code>
<code>mur-family</code>
Each sub-directory contains the tree (<code>.tre</code>) files presented in the main manuscript and the corresponding raw IQ-TREE (<code>.ckp.gz</code>) results. Each also contains various files needed to reproduce the phylogenetic analyses (see the <code>command-line.sh</code> file).
Provider:
figshare
Created:
2022-11-30



