Latent Taxonomic Signatures

Source: Mendeley Data. Updated 2021-06-29. Indexed 2026-04-09.
Download link:
https://data.mendeley.com/datasets/8tv3dc26vg/1
Description:
The folder "Supplementary material" contains all supplementary data referenced in the manuscript "Latent Taxonomic Signatures: alignment free approach reveals semantic properties of species proteomes". The contents are listed below in the order in which the manuscript references them:

Supplementary Figure 1.docx – Figure describing the LSA language model scheme
Supplementary Table 1.xlsx – Excel sheet with information on the taxa included in the LSA species model
Supplementary Figure 2.docx – Figure showing the protein tokenization scheme, cosine similarity, and taxonomy assignment used in the voting scenario method
Supplementary Table 2.docx – Table of download links for FASTA files with the “train” and “test” protein sequence sets used in this study
Supplementary Table 3.docx – Table showing the percentage of the initial taxa query space, as defined by available taxonomy lineage data, used in the SBH and VSM method-benchmarking tests
Supplementary Table 4.xlsx – Excel sheets containing both the relaxed orphan sequence dataset and the NCBI Clusters dataset used in this study
Supplementary Dataset 1.zip – Zip archive of FASTA-formatted sequences comprising the “stringent” orphan dataset from randomly selected species (the species NCBI taxId is in the file name)
Supplementary Figure 3.docx – Figure showing a schematic overview of protein family taxonomic deconstruction and the resulting species vector “intra-class” and “inter-class” comparison
Supplementary Table 5.xlsx – Excel sheet with sequence information from the protein family taxonomy-based groups used in the “selfish” vs “altruistic” mode of evolution experiment
Created:
2021-06-29