
otter_primekg

ModelScope Community. Updated: 2025-12-05. Indexed: 2025-12-06.
Download link:
https://modelscope.cn/datasets/ibm-research/otter_primekg
Resource description:
# Otter PrimeKG Dataset Card

The Otter PrimeKG dataset contains 12,757,257 triples covering proteins, drugs, and diseases. It includes protein sequences, SMILES strings, and text.

## Dataset details

#### PrimeKG

PrimeKG (the Precision Medicine Knowledge Graph) integrates 20 biomedical resources and describes 17,080 diseases with 4 million relationships. It includes 29,786 gene/protein nodes and 7,957 drug nodes. The Multimodal Knowledge Graph (MKG) that we built from PrimeKG contains 13 modalities and 12,757,300 edges (154,130 data properties and 12,603,170 object properties), including 642,150 edges describing interactions between proteins, 25,653 edges describing drug-protein interactions, and 2,672,628 edges describing interactions between drugs.

**Original dataset:**

- [Project page](https://zitniklab.hms.harvard.edu/projects/PrimeKG)
- Citation: Chandak, P., Huang, K. & Zitnik, M. Building a knowledge graph to enable precision medicine. Sci Data 10, 67 (2023). https://doi.org/10.1038/s41597-023-01960-3

**Paper or resources for more information:**

- [GitHub Repo](https://github.com/IBM/otter-knowledge)
- [Paper](https://arxiv.org/abs/2306.12802)

**License:** MIT

**Where to send questions or comments about the dataset:**

- [GitHub Repo](https://github.com/IBM/otter-knowledge)

**Models trained on Otter PrimeKG**

- [ibm/otter_primekg_classifier](https://huggingface.co/ibm/otter_primekg_classifier)
- [ibm/otter_primekg_distmult](https://huggingface.co/ibm/otter_primekg_distmult)
- [ibm/otter_primekg_transe](https://huggingface.co/ibm/otter_primekg_transe)
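
The edge counts quoted in the dataset details can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the numbers printed in the card. It does not load or inspect the dataset, and the constant and function names (`edge_summary` and so on) are illustrative choices, not part of any Otter or PrimeKG API.

```python
# Counts copied from the dataset card; nothing here reads the actual data.
TRIPLES_HEADLINE = 12_757_257      # "contains 12,757,257 triples"
TOTAL_EDGES = 12_757_300           # "12,757,300 edges"
DATA_PROPERTY_EDGES = 154_130
OBJECT_PROPERTY_EDGES = 12_603_170

# The three interaction categories the card calls out by name.
INTERACTION_EDGES = {
    "protein-protein": 642_150,
    "drug-protein": 25_653,
    "drug-drug": 2_672_628,
}


def edge_summary():
    """Return simple consistency figures derived from the quoted counts."""
    parts_sum = DATA_PROPERTY_EDGES + OBJECT_PROPERTY_EDGES
    named = sum(INTERACTION_EDGES.values())
    return {
        # Do data-property and object-property edges add up to the total?
        "parts_match_total": parts_sum == TOTAL_EDGES,
        # How many edges fall in the three named interaction categories?
        "named_interaction_edges": named,
        # What share of object-property edges do those categories cover?
        "named_share_of_object_edges": named / OBJECT_PROPERTY_EDGES,
        # Gap between the quoted edge total and the headline triple count.
        "edges_minus_triples": TOTAL_EDGES - TRIPLES_HEADLINE,
    }


if __name__ == "__main__":
    for key, value in edge_summary().items():
        print(f"{key}: {value}")
```

The data-property and object-property counts sum exactly to the quoted edge total, and the three named interaction categories cover a little over a quarter of the object-property edges. The headline triple count and the edge total differ by a small amount that the card does not explain.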

Provider:
maas
Created:
2025-10-04