Medical Abstracts
arXiv, indexed 2025-09-30
Download link:
https://github.com/sebischair/medical-abstracts-tc-corpus
Description:
This dataset contains 28,880 medical abstracts describing patient conditions in 5 categories. Only about half of the abstracts are annotated. The processed version adds descriptive labels and splits the data into training and test sets. It is available under the Creative Commons Attribution-ShareAlike 3.0 license. Task type: text classification.
Provider:
Kaggle



