
OBO-syn

Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/HanwenXuTHU/GraphPrompt
Description:
OBO-syn is an expert-curated biomedical dataset for evaluating synonym prediction methods. It covers 70 distinct concept types and contains 2 million curated concept-term pairs, organized as 70 relational graphs. What sets it apart is this graph structure: entities are linked to one another, and semantic relations hold between each entity and its synonyms. This makes OBO-syn a new large-scale benchmark for entity normalization methods, with biomedical synonym prediction as its core task.
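
As a rough illustration of the task, the sketch below shows one way concept-term pairs and a relation graph could be held in memory, and how top-1 accuracy of a synonym predictor could be computed. Everything here is an assumption made for illustration: the concept IDs, terms, edges, and function names are hypothetical toy values, not the actual OBO-syn file format or the GraphPrompt API, and the exact-match predictor is only a trivial baseline.

```python
from collections import defaultdict

# Hypothetical toy data (not taken from OBO-syn):
# (concept_id, term) pairs and undirected concept-to-concept edges.
PAIRS = [
    ("C1", "heart attack"),
    ("C1", "myocardial infarction"),
    ("C2", "cardiac arrest"),
]
EDGES = [("C1", "C3"), ("C2", "C3")]


def build_graph(edges):
    """Return an adjacency map: concept_id -> set of neighbouring concept_ids."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def exact_match_predictor(pairs):
    """Trivial baseline: map a term to a concept only if the term was seen before."""
    lookup = {term.lower(): cid for cid, term in pairs}
    return lambda term: lookup.get(term.lower())


def top1_accuracy(predict, test_pairs):
    """Fraction of (concept_id, term) pairs whose term is mapped to the right concept."""
    if not test_pairs:
        return 0.0
    correct = sum(1 for cid, term in test_pairs if predict(term) == cid)
    return correct / len(test_pairs)


if __name__ == "__main__":
    predict = exact_match_predictor(PAIRS)
    held_out = [("C1", "Heart Attack"), ("C2", "stroke")]
    print(top1_accuracy(predict, held_out))
    print(sorted(build_graph(EDGES)["C3"]))
```

A graph-aware method would replace `exact_match_predictor` with a model that also uses the neighbours returned by `build_graph`; the evaluation function stays the same.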
Provider:
OBO