Evolution of venom production in marine predatory snails

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/13685165
Resource description:
This repository contains the following datasets:

annotations.zip: includes two files per species. The "species_code_annot.tsv" file (e.g., CI_annot.tsv) includes the BLAST hits to UniProt/Swiss-Prot (_sp), gastropod genomes (_gastr), CDD (_cdd), Pfam (_pfam), ToxProt (_tox), and ConoServer (_cono), as well as the output of SignalP and ConoPrec. The "species_code_topGO.tsv" file (e.g., CI_topGO.tsv) contains the GO annotations in a format compatible with topGO.

assemblies.zip: includes the nucleotide coding sequences as "species_code_cds.fasta" (e.g., CI_cds.fasta) and the predicted amino acid sequences as "species_code.faa" (e.g., CI.faa), both in FASTA format.

expression_matrices.zip: includes the final, quality-filtered expression matrices in TPM for each species separately as "species_code_tpm.tsv" (e.g., CI_tpm.tsv). The multi-species expression matrices based on the random-selection method ("multispecies_tpm_random.tsv") and the mean method ("multispecies_tpm_mean.tsv") are also provided.

tissue_specific_gene_sets.zip: includes the list of tissue-specific genes for each species as "species_code_fc2_tb.tsv" (e.g., CI_fc2_tb.tsv). The column "tissue1" corresponds to the tissue with the highest TPM value ("tpm1"), that is, the tissue to which the gene is specific, while "tissue2" corresponds to the tissue with the second-highest TPM value ("tpm2"). "FC" is the fold-change.

orthologer_output: includes the orthogroup assignments as output by the software OrthoLoger. Specifically, "path2proteome_orthogroups.txt" contains the orthogroup assignment for all genes assigned to an orthogroup, while "path2proteome_stats.txt" lists summary statistics (e.g., orthogroup sizes).

CAGEE_output: includes the output from the software CAGEE, as described in the manual (https://github.com/hahnlab/CAGEE/blob/main/docs/manual/cagee_manual.md#Installation). Results for the gene expression matrices based on the random-selection method and on the mean-based method are reported in separate folders. Within each folder, the results for the gland, salivary glands, and oesophagus are reported separately. Additionally, the ultrametric species tree and the sigma tree, both in Newick format, are provided.
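
The relationship between the "tissue1", "tpm1", "tissue2", "tpm2", and "FC" columns can be illustrated with a minimal Python sketch. It rests on several assumptions that the description does not confirm: that "FC" is the ratio tpm1 / tpm2, that the "fc2" in the file name refers to a fold-change cutoff of 2, and that a per-species TPM table has a "gene" column followed by one column per tissue. The tissue names, gene IDs, values, and function names below are hypothetical and are not taken from the dataset.

```python
import csv
import io


def top_two_tissues(tpm_by_tissue):
    """Return (tissue1, tpm1, tissue2, tpm2, fc) for one gene.

    ASSUMPTION: fold-change is the highest TPM divided by the
    second-highest TPM; the dataset description does not define it.
    """
    ranked = sorted(tpm_by_tissue.items(), key=lambda kv: kv[1], reverse=True)
    (tissue1, tpm1), (tissue2, tpm2) = ranked[0], ranked[1]
    fc = tpm1 / tpm2 if tpm2 > 0 else float("inf")
    return tissue1, tpm1, tissue2, tpm2, fc


def tissue_specific_genes(tsv_text, min_fc=2.0):
    """Keep genes whose top tissue exceeds the runner-up by min_fc.

    ASSUMPTION: min_fc=2.0 is a guess based on "fc2" in the file name,
    and the "gene" column name is hypothetical.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for record in reader:
        gene = record.pop("gene")
        tpm = {tissue: float(value) for tissue, value in record.items()}
        tissue1, tpm1, tissue2, tpm2, fc = top_two_tissues(tpm)
        # Skip genes that are not expressed anywhere or fall below the cutoff.
        if tpm1 > 0 and fc >= min_fc:
            rows.append({
                "gene": gene,
                "tissue1": tissue1, "tpm1": tpm1,
                "tissue2": tissue2, "tpm2": tpm2,
                "FC": fc,
            })
    return rows


# Hypothetical toy matrix; tissue names and values are made up.
EXAMPLE_TSV = (
    "gene\tvenom_gland\tsalivary_gland\toesophagus\n"
    "g1\t100.0\t10.0\t5.0\n"
    "g2\t8.0\t9.0\t7.5\n"
    "g3\t0.0\t4.0\t1.0\n"
)

if __name__ == "__main__":
    for row in tissue_specific_genes(EXAMPLE_TSV):
        print(row)
```

To try the same logic on one of the real expression matrices, the text of a file such as CI_tpm.tsv could be passed in place of EXAMPLE_TSV, provided its header matches the assumed layout; the actual column names should be checked first.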
Created:
2024-09-09