CANDID-III Dataset
Source: auckland.figshare.com | Updated: 2024-04-19 | Indexed: 2025-01-21
Download link:
https://auckland.figshare.com/articles/dataset/CANDID-III_Dataset/22726004/1
Description:
A dataset of 288,776 anonymized adult chest X-rays in 1024 x 1024 pixel DICOM format, with corresponding anonymized free-text reports, from Dunedin Hospital, New Zealand, acquired between 2010 and 2020. The radiology reports, written by FRANZCR radiologists, were manually annotated for 45 common radiological findings mapped to the Unified Medical Language System (UMLS) ontology. Each annotation is a multi-class label with one of four values: positive, uncertain, negative, or not mentioned. 33,486 studies were labeled manually and 255,290 were labeled by deep learning models. The accuracy of the AI-labeled portion of the dataset, for each label, will be outlined in the published paper.

In the provided dataset, image filenames contain a patient index (enabling analyses that require grouping images by patient) as well as anonymized date-of-acquisition information that preserves the temporal relationship between images. The dataset can be used for training and testing deep learning algorithms for adult chest X-rays.

Access to the data requires an ethics training process in two steps:

1. Complete the online ethics course at https://globalhealthtrainingcentre.tghn.org/ethics-and-best-practices-sharing-individual-level-data-clinical-and-public-health-research/. You will need to register an account to take the free course. Once you have finished the course quiz, send the course certificate to sijingfeng@gmail.com.
2. Sign the Data Use Agreement, which can be accessed at Data Use Agreement- Unanonymised data.pdf. Once you have signed it, send the signed copy to sijingfeng@gmail.com as well.

After successful completion of both steps, a private link to download the dataset will be sent.
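Because filenames carry a patient index, images can be kept together per patient when building train and test sets, which avoids the same patient appearing on both sides of the split. The following is a minimal sketch of that idea, not the dataset authors' method. The filename layout `<patient index>_<anonymized date>_<n>.dcm`, the regular expression, and the helper names `patient_id` and `split_by_patient` are all assumptions made for illustration; the description does not state the actual naming scheme, so the pattern will likely need adjusting.

```python
import hashlib
import re
from collections import defaultdict

# ASSUMPTION: hypothetical filename layout "<patient index>_<anonymized date>_<n>.dcm".
# The real CANDID-III naming scheme is not given in the description; adjust the pattern.
PATIENT_RE = re.compile(r"^(?P<patient>[^_]+)_")


def patient_id(filename: str) -> str:
    """Extract the patient index from a filename (hypothetical layout)."""
    match = PATIENT_RE.match(filename)
    if match is None:
        raise ValueError(f"cannot find a patient index in {filename!r}")
    return match.group("patient")


def split_by_patient(filenames, test_fraction=0.2):
    """Assign every image of a given patient to the same split."""
    splits = defaultdict(list)
    for name in filenames:
        # Hash the patient index so the assignment is stable across runs
        # and depends only on the patient, never on the individual image.
        digest = hashlib.sha256(patient_id(name).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64
        splits["test" if bucket < test_fraction else "train"].append(name)
    return dict(splits)
```

With a large number of patients, roughly `test_fraction` of them should land in the test split; on a small sample the proportion can differ noticeably, since the assignment is made per patient rather than per image.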
Provider:
figshare



