vaibhavalakshmiravideshik/emapa-uberon-4k
Source: Hugging Face. Updated 2026-05-29. Indexed 2026-05-31.
Download link:
https://hf-mirror.com/datasets/vaibhavalakshmiravideshik/emapa-uberon-4k
Description:
Uberon-EMAPA Entity Alignment 4K is a biomedical entity alignment benchmark for heterogeneous knowledge graphs. It aligns anatomical entities between the Uber-anatomy Ontology (Uberon) and the Edinburgh Mouse Atlas Project Anatomy ontology (EMAPA), using the curated cross-references embedded directly in Uberon as the basis for the alignment. The benchmark contains 4,079 validated entity alignment pairs and is designed for realistic entity alignment research rather than simplified label matching. Because the gold alignment is embedded in the full Uberon and EMAPA ontology graphs, models must recover aligned anatomical concepts in the presence of species mismatch, temporal-vs-atemporal representation mismatch, schema asymmetry, and substantial non-gold background structure. The dataset includes entity alignment files (ent_links), relation triple files (rel_triples_1 and rel_triples_2), and attribute triple files (attr_triples_1 and attr_triples_2), totaling approximately 200,943 lines and 27.37 MB. The Uberon graph contains 14,971 entities, 42,260 relation triples, and 115,702 attribute triples; the EMAPA graph contains 8,106 entities, 28,299 relation triples, and 9,603 attribute triples. The dataset's heterogeneity shows up as schema mismatch, temporal-vs-atemporal mismatch, species mismatch, attribute asymmetry, and a high background ratio.
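As a rough illustration of how the files described above might be consumed, the following sketch parses link and triple rows and estimates how much of one graph is non-gold background. It assumes tab-separated rows (two columns for ent_links, three for the triple files), which the description does not confirm; the helper names and the toy identifiers such as UBERON:A are hypothetical placeholders, not real dataset contents.

```python
from io import StringIO


def read_tsv(handle, n_cols):
    """Parse tab-separated rows with exactly n_cols columns, skipping blank lines.

    The tab-separated layout is an assumption, not something the dataset
    description states.
    """
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split at most n_cols - 1 times so a literal value in the last
        # column may itself contain tabs.
        parts = line.split("\t", n_cols - 1)
        if len(parts) != n_cols:
            raise ValueError(f"expected {n_cols} columns: {line!r}")
        rows.append(tuple(parts))
    return rows


def entities_in(rel_triples):
    """Collect every entity that appears as head or tail of a relation triple."""
    entities = set()
    for head, _relation, tail in rel_triples:
        entities.add(head)
        entities.add(tail)
    return entities


def background_ratio(entities, aligned):
    """Fraction of a graph's entities that have no gold alignment."""
    if not entities:
        return 0.0
    return 1.0 - len(entities & aligned) / len(entities)


# Toy stand-ins for ent_links and rel_triples_1 (placeholder IDs only).
links = read_tsv(StringIO("UBERON:A\tEMAPA:X\n"), 2)
rels_1 = read_tsv(
    StringIO(
        "UBERON:A\tpart_of\tUBERON:B\n"
        "UBERON:C\tpart_of\tUBERON:B\n"
        "UBERON:D\tis_a\tUBERON:A\n"
    ),
    3,
)

aligned_left = {left for left, _right in links}
ratio = background_ratio(entities_in(rels_1), aligned_left)
print(f"background ratio: {ratio:.2f}")
```

With the real files, the StringIO objects would be replaced by open file handles; note that entities appearing only in attribute triples would be missed by entities_in, so the result is an estimate.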
Provided by:
vaibhavalakshmiravideshik



