PMC-Patients

Indexed from arXiv on 2025-09-30

Download link:

https://doi.org/10.1038/s41597-023-02814-8

Resource description:

This dataset contains de-identified patient summaries intended for generating United States Medical Licensing Examination (USMLE)-style questions. The average patient summary is approximately 419 words long. The dataset is described as large and provides multiple entries for question generation, of which 373 come from a manually annotated set and 385 from a set generated by large language models. The target task is question generation for the USMLE.
