pucpr-br/nestedclinbr
Source: Hugging Face | Updated: 2025-07-20 | Indexed: 2025-08-09
Download link:
https://hf-mirror.com/datasets/pucpr-br/nestedclinbr
Description:
NestedClinBr is a corpus of Brazilian Portuguese clinical narratives annotated with nested and discontinuous entities. Its main goal is to provide a human-annotated resource for training and evaluating machine learning models that extract valuable medical information from Portuguese text, with a focus on nested and discontinuous entities, an important but relatively underexplored task. The corpus covers entity types such as Problem, Treatment, Test, and Anatomy, and provides statistics for both the training and test sets. The final inter-annotator agreement (IAA) is 94.08%, indicating substantial agreement among annotators.
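To make the terms "nested" and "discontinuous" concrete, the following is a minimal Python sketch of one way such entities could be represented as character spans. The example sentence, the `Entity` class, and the helpers `span_of` and `is_nested` are hypothetical illustrations; they are not taken from the corpus and do not reflect its actual file format or schema. Only the labels Problem and Anatomy come from the description above.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity record: a label plus one or more text spans."""
    label: str
    spans: Tuple[Span, ...]

    @property
    def is_discontinuous(self) -> bool:
        # A discontinuous entity is made of more than one separate span.
        return len(self.spans) > 1

    def surface(self, text: str) -> str:
        # Join the covered fragments to recover the entity's surface form.
        return " ".join(text[s:e] for s, e in self.spans)


def span_of(text: str, phrase: str) -> Span:
    """Return the offsets of the first occurrence of phrase in text."""
    start = text.index(phrase)
    return (start, start + len(phrase))


def is_nested(inner: Entity, outer: Entity) -> bool:
    """True if every span of inner lies inside some span of outer."""
    if inner.spans == outer.spans:
        return False
    return all(
        any(o_start <= s and e <= o_end for o_start, o_end in outer.spans)
        for s, e in inner.spans
    )


# Invented sentence ("pain in the arm and in the leg"), not corpus data.
text = "dor no braço e na perna"

arm_pain = Entity("Problem", (span_of(text, "dor no braço"),))
arm = Entity("Anatomy", (span_of(text, "braço"),))  # nested in arm_pain
# "dor ... na perna" skips the words in between, so it needs two spans.
leg_pain = Entity("Problem", (span_of(text, "dor"), span_of(text, "na perna")))
leg = Entity("Anatomy", (span_of(text, "perna"),))  # nested in leg_pain

print(is_nested(arm, arm_pain), leg_pain.is_discontinuous, leg_pain.surface(text))
```

In this sketch the Anatomy mention sits inside the Problem mention (nesting), while the second Problem mention shares the word "dor" with the first and skips the intervening tokens (discontinuity). A flat, single-span tagging scheme cannot encode either case directly.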
Provider:
pucpr-br



