Table4_Identifying Functions of Proteins in Mice With Functional Embedding Features.XLSX
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://figshare.com/articles/dataset/Table4_Identifying_Functions_of_Proteins_in_Mice_With_Functional_Embedding_Features_XLSX/19769944
Description:
Exploring the biological functions of proteins is a central task in current biology. Because some organisms have a very large number of proteins, determining their functions one by one through traditional experiments is impractical, so quick and reliable methods for identifying protein functions are needed. The considerable accumulation of protein knowledge and recent developments in computer science offer an alternative: computational methods. Several efforts have been made in this field. Most previous methods adopted protein sequence features or directly used the linkage from a protein–protein interaction (PPI) network. In this study, we proposed novel multi-label classifiers that adopt new embedding features to represent proteins. These features were derived from functional domains and a PPI network via word embedding and network embedding, respectively. The minimum redundancy maximum relevance method was used to assess the features, generating a feature list. Incremental feature selection, incorporating RAndom k-labELsets to construct multi-label classifiers, used this list to build two optimum classifiers, each corresponding to one of two key measurements: accuracy and exact match. Both classifiers performed well and were superior to classifiers that used features extracted by traditional methods.
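The description names two multi-label measurements, accuracy and exact match, without defining them. As a minimal sketch, assuming the common example-based definitions (accuracy as the mean Jaccard overlap between true and predicted label sets, exact match as the fraction of samples whose predicted set equals the true set), they could be computed as follows. The function names and the toy labels are hypothetical and are not taken from the dataset or the study's code.

```python
from typing import Sequence, Set


def multilabel_accuracy(true_sets: Sequence[Set[str]],
                        pred_sets: Sequence[Set[str]]) -> float:
    """Mean Jaccard overlap |Y ∩ Z| / |Y ∪ Z| over all samples (assumed definition)."""
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, z in zip(true_sets, pred_sets):
        union = y | z
        # Two empty label sets are treated as a perfect match.
        total += len(y & z) / len(union) if union else 1.0
    return total / len(true_sets)


def exact_match(true_sets: Sequence[Set[str]],
                pred_sets: Sequence[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true set."""
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(y == z for y, z in zip(true_sets, pred_sets)) / len(true_sets)


# Toy example with hypothetical function labels for three proteins.
true_labels = [{"A", "B"}, {"C"}, {"A"}]
pred_labels = [{"A"}, {"C"}, {"B"}]
print(multilabel_accuracy(true_labels, pred_labels))
print(exact_match(true_labels, pred_labels))
```

Exact match is the stricter of the two because a partially correct prediction scores zero, which may be why the description reports a separate optimum classifier for each measurement.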
Created:
2022-05-16



