medexanon/Medex
Source: Hugging Face. Updated: 2025-07-30. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/medexanon/Medex
Description:
The Medex dataset contains facts about small molecules and genes/proteins extracted from a large number of PubMed articles. Each fact is accompanied by the identifier of its associated small molecule or gene/protein: a SMILES string for small molecules and an NCBI Gene ID for genes/proteins. The dataset also includes publication information for the paper each fact was extracted from (journal name, ISSN, and eISSN), which allows coarse-grained filtering.
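The coarse-grained filtering mentioned above could look roughly like the following sketch. It is only an illustration: the field names (`fact`, `entity_type`, `identifier`, `journal`, `issn`, `eissn`), the sample records, and the placeholder ISSN values are all assumptions, not the dataset's documented schema, so the actual column names should be checked before adapting it.

```python
from typing import Dict, Iterable, List

# Placeholder records for illustration only. Field names and ISSN values are
# assumptions, not the documented Medex schema or real journal identifiers.
SAMPLE_RECORDS: List[Dict[str, str]] = [
    {
        "fact": "Placeholder fact about a small molecule.",
        "entity_type": "small_molecule",
        "identifier": "CC(=O)OC1=CC=CC=C1C(=O)O",  # a SMILES string
        "journal": "Placeholder Journal A",
        "issn": "0000-0001",
        "eissn": "0000-0002",
    },
    {
        "fact": "Placeholder fact about a gene/protein.",
        "entity_type": "gene_protein",
        "identifier": "7157",  # an NCBI Gene ID
        "journal": "Placeholder Journal B",
        "issn": "0000-0003",
        "eissn": "0000-0004",
    },
]


def filter_by_issn(
    records: Iterable[Dict[str, str]], allowed_issns: Iterable[str]
) -> List[Dict[str, str]]:
    """Keep records whose ISSN or eISSN appears in the allowlist."""
    allowed = set(allowed_issns)
    return [
        record
        for record in records
        if record.get("issn") in allowed or record.get("eissn") in allowed
    ]


if __name__ == "__main__":
    # Keep only facts from journals on a (hypothetical) allowlist.
    for record in filter_by_issn(SAMPLE_RECORDS, {"0000-0001"}):
        print(record["identifier"], "|", record["fact"])
```

Matching on either ISSN or eISSN means a journal is kept regardless of which of its two identifiers appears in the allowlist.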
Provider:
medexanon



