Evolution of venom production in marine predatory snails
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13685165
Description:
This repository contains the following datasets:
annotations.zip: includes two files per species. The "species_code_annot.tsv" file (e.g., CI_annot.tsv) includes the BLAST hits to UniProt/SwissProt (_sp), gastropod genomes (_gastr), CDD (_cdd), Pfam (_pfam), ToxProt (_tox), and ConoServer (_cono), as well as the output of SignalP and ConoPrec. The "species_code_topGO.tsv" file (e.g., CI_topGO.tsv) contains the GO annotations in a format compatible with topGO.
assemblies.zip: includes the nucleotide coding sequences as "species_code_cds.fasta" (e.g., CI_cds.fasta) and the predicted amino acid sequences as "species_code.faa" (e.g., CI.faa), both in FASTA format.
expression_matrices.zip: includes the final, quality-filtered expression matrices in TPM for each species separately as "species_code_tpm.tsv" (e.g., CI_tpm.tsv). The multi-species expression matrices based on the random-selection method ("multispecies_tpm_random.tsv") and the mean method ("multispecies_tpm_mean.tsv") are also provided.
tissue_specific_gene_sets.zip: includes the list of tissue-specific genes for each species as "species_code_fc2_tb.tsv" (e.g., CI_fc2_tb.tsv). The column "tissue1" corresponds to the tissue with the highest TPM value ("tpm1"), which is therefore the tissue to which that gene is specific, while "tissue2" corresponds to the tissue with the second-highest TPM value ("tpm2"). "FC" is the fold-change.
orthologer_output: includes the orthogroup assignment as output by the software OrthoLoger. Specifically, "path2proteome_orthogroups.txt" contains the orthogroup assignment for all genes assigned to an orthogroup, while "path2proteome_stats.txt" lists summary statistics (e.g., the size of the orthogroups).
CAGEE_output: includes the output of the software CAGEE, as described in its manual (https://github.com/hahnlab/CAGEE/blob/main/docs/manual/cagee_manual.md#Installation). The results for the gene expression matrix based on the random-selection method and for the one based on the mean method are reported in separate folders. Within each folder, the results for the gland, salivary glands, and oesophagus are reported separately. Additionally, the ultrametric species tree and the sigma tree, both in Newick format, are provided.
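
As an illustration of how the columns in the tissue-specific gene sets relate to a per-species TPM matrix, the following minimal Python sketch ranks tissues by TPM for each gene and reports "tissue1", "tpm1", "tissue2", "tpm2", and "FC". It rests on several assumptions that are not stated in this record: that "FC" is the ratio tpm1/tpm2, that the "fc2" in the file names denotes a minimum fold-change of 2, and that the matrix has one row per gene and one column per tissue. The gene identifiers, column names, and values in the example are hypothetical, and the sketch is not the authors' pipeline.

```python
import csv
import io

# Hypothetical example matrix: one row per gene, one TPM column per tissue.
# The real "species_code_tpm.tsv" layout may differ (assumption).
EXAMPLE_TSV = (
    "gene_id\tgland\tsalivary_gland\toesophagus\n"
    "g1\t100.0\t10.0\t5.0\n"
    "g2\t8.0\t6.0\t7.0\n"
    "g3\t0.0\t3.0\t12.0\n"
)


def top_two_tissues(tpm_by_tissue, min_fc=2.0):
    """Return tissue1/tpm1/tissue2/tpm2/FC for one gene, or None if the gene
    does not reach the assumed fold-change threshold."""
    # Rank tissues from highest to lowest TPM.
    ranked = sorted(tpm_by_tissue.items(), key=lambda kv: kv[1], reverse=True)
    (tissue1, tpm1), (tissue2, tpm2) = ranked[0], ranked[1]
    if tpm1 <= 0:
        return None  # gene is not expressed in any tissue
    # Assumed definition: FC = highest TPM / second-highest TPM.
    fc = float("inf") if tpm2 == 0 else tpm1 / tpm2
    if fc < min_fc:
        return None
    return {"tissue1": tissue1, "tpm1": tpm1,
            "tissue2": tissue2, "tpm2": tpm2, "FC": fc}


def tissue_specific_genes(tsv_text, id_column="gene_id", min_fc=2.0):
    """Parse a tab-separated TPM matrix and keep genes passing the threshold."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    results = []
    for row in reader:
        gene = row.pop(id_column)
        tpm = {tissue: float(value) for tissue, value in row.items()}
        hit = top_two_tissues(tpm, min_fc=min_fc)
        if hit is not None:
            results.append({"gene": gene, **hit})
    return results


if __name__ == "__main__":
    for record in tissue_specific_genes(EXAMPLE_TSV):
        print(record)
```

To apply the same logic to a downloaded file, the text of, for example, CI_tpm.tsv could be read and passed to tissue_specific_genes, after adjusting id_column to the actual header of the gene identifier column.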
Created:
2024-09-09



