BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language

Source: physionet.org (indexed 2025-03-25)
Download link:
https://physionet.org/content/brateca/1.1/
Resource description:
Computational medicine research requires clinical data for training and testing, so the development of datasets composed of real hospital data is of utmost importance in this field. Most such data collections are in English, were collected in anglophone countries, and do not reflect other clinical realities, which increases the importance of national datasets for projects that hope to positively impact public health. This paper presents a new Brazilian clinical dataset containing over 70,000 admissions from 10 hospitals in two Brazilian states, comprising over 2.5 million free-text clinical notes alongside patient descriptors, prescription information, and exam results. The data were collected, organized, and deidentified, and are being distributed via credentialed access for use by the research community. In presenting the dataset, we explore its structure, population, and potential benefits for clinical AI tasks.

Provider:
physionet.org