PMC-Patients

Source: arXiv (indexed 2025-09-30)
Download link:
https://doi.org/10.1038/s41597-023-02814-8
Description:
This dataset comprises de-identified patient summaries intended for generating United States Medical Licensing Examination (USMLE)-style questions. Each patient summary averages approximately 419 words. The dataset contains 373 entries from a manually annotated set and 385 entries from a set generated by large language models. Its core task is USMLE-style question generation.