MedLane
Source: arXiv · Updated 2023-09-22 · Indexed 2024-06-21
Download link:
https://github.com/machinelearning4health/MedLane
Overview:
The MedLane dataset was created by the College of Information Sciences and Technology at Pennsylvania State University for the task of clinical language simplification. It contains 12,801 training samples and is intended to support the development and evaluation of automated clinical language simplification methods. In addition to aligned sentence pairs, it provides term-level annotations that map specialized clinical terminology to lay language. To build the dataset, the researchers selected sentences with a method based on heuristic features, with the aim of ensuring quality and applicability. Its main application areas are medical informatics and natural language processing (NLP), where it is meant to help patients understand medical information and to support doctor-patient communication.
Provided by:
College of Information Sciences and Technology, Pennsylvania State University
Created:
2020-12-04



