
M. musculus predictions

Source: DataCite Commons. Updated 2020-08-29; indexed 2024-07-27.
Download link:
https://figshare.com/articles/M_musculus_predictions/6633431/1
Description:
About this dataset: These are the 200,000 most reliable protein-protein interaction (PPI) predictions for M. musculus, provided as a CSV file.<br>Motivation: PPIs play a key role in many cellular processes. Most PPI annotations mix experimental and computational data. This mix optimizes coverage but obscures the origin of each annotation. Some resources focus exclusively on reliable experimental data. Here, we focused on new pairs of interacting proteins for several model organisms, predicted solely by sequence-based methods.<br>Results: We extracted reliable experimental data on which proteins interact (binary interactions) from public databases for eight diverse model organisms, namely Escherichia coli, Schizosaccharomyces pombe, Plasmodium falciparum, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus, Rattus norvegicus, and Arabidopsis thaliana, in addition to the previously used Homo sapiens and Saccharomyces cerevisiae. These data formed the basis for developing a PPI prediction method for each model organism. The method used evolutionary information through a profile-kernel Support Vector Machine (SVM). With the resulting eight models, we predicted all possible protein pairs in each organism and made the top predictions available through a web application. Almost all of the PPIs made available were predicted between proteins that have not been observed in any interaction, in particular for less well-studied organisms. Our work therefore complements existing resources and, because its predictions are largely absent from them, is particularly helpful for designing experiments. Experimental annotations and computational predictions are strongly influenced by the fact that some proteins have many partners and others few. To optimize machine learning, the new methods explicitly ignored this network structure. This might be another strength of our approach.
The database interface representing our results is accessible from https://rostlab.org/services/ppipair/.<br>Please cite us when you are using this data:<pre>@article{tran2018profppidb, title={ProfPPIdb: pairs of physical protein-protein interactions predicted for entire proteomes}, author={Tran, Linh and Hamp, Tobias and Rost, Burkhard}, journal={bioRxiv}, pages={332510}, year={2018}, publisher={Cold Spring Harbor Laboratory} }</pre>
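As a rough illustration of working with a CSV of scored pairs like the one described above, the following Python sketch reads the rows, keeps the highest-scoring predictions, and counts how many predicted partners each protein has. The column names `protein_a`, `protein_b`, and `score`, the sample rows, and the helper function names are all hypothetical: the actual header and column order of the figshare file are not documented here and should be checked before adapting this.

```python
import csv
import io
from collections import Counter

# Hypothetical layout and made-up rows, used only so the sketch runs on its own.
# The real file's header and column order are assumptions, not documented facts.
SAMPLE = """protein_a,protein_b,score
P1,P2,0.91
P1,P3,0.75
P4,P5,0.98
P2,P3,0.60
"""


def load_predictions(handle):
    """Read (protein_a, protein_b, score) tuples from an open CSV handle."""
    reader = csv.DictReader(handle)
    return [
        (row["protein_a"], row["protein_b"], float(row["score"]))
        for row in reader
    ]


def top_predictions(pairs, k):
    """Return the k highest-scoring pairs, best first."""
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)[:k]


def partner_counts(pairs):
    """Count how many predicted partners each protein has."""
    counts = Counter()
    for protein_a, protein_b, _score in pairs:
        counts[protein_a] += 1
        counts[protein_b] += 1
    return counts


pairs = load_predictions(io.StringIO(SAMPLE))
best = top_predictions(pairs, 2)
degrees = partner_counts(pairs)
print(best)
print(degrees)
```

To use this on the downloaded file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")` and adjust the column names to match the real header. The partner counts give a quick view of whether a few proteins dominate the predicted pairs.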

Provider:
figshare
Created:
2018-06-21