Xavier1234/Asclepius-Synthetic-Clinical-Notes
Source: Hugging Face. Updated 2026-04-27. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/Xavier1234/Asclepius-Synthetic-Clinical-Notes
Description:
This dataset, Asclepius: Synthetic Clinical Notes & Instruction Dataset, is intended for building clinical large language models. It follows a clinical note, question, answer format and contains 157k instruction-answer pairs over synthetic discharge summaries. The pairs were generated with GPT-3.5 from notes synthesized from PMC-Patients case reports. The dataset covers 8 tasks: Named Entity Recognition, Abbreviation Expansion, Relation Extraction, Temporal Information Extraction, Coreference Resolution, Paraphrasing, Summarization, and Question Answering. The dataset is in English and consists of a single file, synthetic.csv, with the following fields: patient_id (the unique case report ID from PMC-Patients), patient (the case report text), question (the instruction GPT-3.5 generated from the patient text), answer (the answer GPT-3.5 generated for the given case report and question), and task (the category the question belongs to).
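As a minimal sketch of how the fields described above might be consumed, the following Python reads rows in the synthetic.csv layout and counts them per task. The function name `count_tasks`, the sample rows, and the exact spelling of the task labels are assumptions made for illustration; they are not taken from the dataset itself.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_FIELDS = ["patient_id", "patient", "question", "answer", "task"]


def count_tasks(csv_file):
    """Count rows per task in a file-like object laid out like synthetic.csv."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_FIELDS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return Counter(row["task"] for row in reader)


# Made-up rows standing in for synthetic.csv (NOT real dataset content).
# The task label strings are assumed, not verified against the real file.
SAMPLE = (
    "patient_id,patient,question,answer,task\n"
    "1,Note text A,What does MI stand for?,Myocardial infarction,Abbreviation Expansion\n"
    "1,Note text A,Summarize the hospital course.,Short summary A,Summarization\n"
    "2,Note text B,Summarize the hospital course.,Short summary B,Summarization\n"
)

counts = count_tasks(io.StringIO(SAMPLE))
for task, n in sorted(counts.items()):
    print(f"{task}: {n}")
```

For the real file, the same function could be called on `open("synthetic.csv", newline="", encoding="utf-8")`, assuming the file has been downloaded locally and is UTF-8 encoded.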
Provided by:
Xavier1234



