Medical Concepts from Snomed CT
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/cambridgeltl/SIPHS
Description:
This dataset contains about 21,000 medical concepts, each associated with a multi-word textual description, and is intended for text-to-entity mapping tasks. It is divided into a training set (14,754 instances), a test set (4,187 instances), and a development set (2,000 instances); these splits total 20,941 instances, so the 21,000 figure is rounded. A concept is included only if it has at least one textual description of 4 or more words.
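The split sizes and the inclusion criterion in the description can be checked with a few lines of Python. This is a minimal sketch, not the dataset authors' code: the helper names are hypothetical, and counting words by whitespace splitting is an assumption, since the listing does not say how words are counted.

```python
# Split sizes as stated in the dataset description.
SPLIT_SIZES = {"train": 14_754, "test": 4_187, "dev": 2_000}


def total_instances(sizes):
    """Sum the instance counts over all splits."""
    return sum(sizes.values())


def split_fractions(sizes):
    """Return each split's share of the total number of instances."""
    total = total_instances(sizes)
    return {name: count / total for name, count in sizes.items()}


def meets_inclusion_criterion(descriptions, min_words=4):
    """Check whether at least one description has min_words or more words.

    Assumption: words are separated by whitespace.
    """
    return any(len(text.split()) >= min_words for text in descriptions)


if __name__ == "__main__":
    print("total instances:", total_instances(SPLIT_SIZES))
    for name, share in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {share:.1%}")
```

Under these assumptions, a concept whose only description is a two-word phrase would be excluded, while a concept with at least one description of four or more words would be kept.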
Provider:
Snomed CT



