intermediate_output_data_files_effect_verification_of_deep_knowledge_representation
DataCite Commons | Updated 2021-02-28 | Indexed 2024-07-28
Download link:
https://figshare.com/articles/dataset/intermediate_output_data_files_effect_verification_of_deep_knowledge_representation/14107331/1
Description:
These intermediate files are used to visually evaluate the effect of the representation learning models.

Visualization distribution of corresponding categorical attributes of the five protein biological properties in training space.
(a) Visualization distribution of categorical attributes of the MeSH concepts in literature-space. Each point represents a MeSH concept and the colors indicate the MeSH tree categories. A total of 35 first-level MeSH tree categories and 16,035 MeSH concepts are included.
(b) Visualization distribution of categorical attributes of the GO terms in literature-space. Each point represents a GO term and the colors indicate the sub-ontology categories. A total of 3 sub-ontology categories (biological process, molecular function, and cellular component) and 29,306 GO terms are included.
(c) Visualization distributions of physiochemical properties of amino acids in sequence-space. Each point represents a 3-gram sub-sequence and the colors indicate the scale for each physiochemical property. A total of 9 physiochemical properties and 10,617 3-gram sub-sequences are included.
(d) Visualization distributions of network edge weight properties in PPI-space. Each point represents a protein node and the colors indicate the scale for each sub-score or the combined score. A total of 5 sub-scores and a combined score, and 33,531 protein nodes are included.
(e) Visualization distribution of chromosome number properties in gene expression-space. Each point represents the genotype-tissue expression data of one gene and the colors indicate the chromosome numbers. A total of 23 chromosome numbers and 20,131 genotype-tissue expression data points are included.
As shown, data points with the same categorical attribute cluster together automatically, indicating that the corresponding categorical attributes are smoothly distributed in the training space.
Abbreviations: SASA, solvent-accessible surface; NCI, net charge index of side chains; MASS, average mass of amino acid.
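The clustering observation in the description can also be checked numerically rather than only by eye. The following is a minimal sketch, assuming the intermediate files have already been loaded into a list of 2-D coordinates and a parallel list of category labels. The file layout is not described in this record, so the toy data and the function name `cluster_separation` are hypothetical and not part of the dataset.

```python
import math
from itertools import combinations


def cluster_separation(points, labels):
    """Return (mean within-category distance, mean between-category distance).

    Hypothetical helper: compares every pair of points and sorts the
    Euclidean distance into "same label" or "different label".
    """
    within, between = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        (within if lp == lq else between).append(d)
    return sum(within) / len(within), sum(between) / len(between)


# Toy stand-in data (an assumption, not taken from the dataset):
# two categories in a 2-D projection.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = ["A", "A", "B", "B"]

within, between = cluster_separation(points, labels)
print(within < between)  # True when same-category points sit closer together
```

A within-category mean that is clearly smaller than the between-category mean is consistent with points of the same category clustering together. The pairwise loop is quadratic, so for the tens of thousands of points listed above a random subsample would be a practical choice.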
Provider:
figshare
Created:
2021-02-28



