OCT5k: A dataset of multi-disease and multi-graded annotations for retinal layers

Name: OCT5k: A dataset of multi-disease and multi-graded annotations for retinal layers
Creator: University College London
Published: 2024-11-07 08:12:55
License: Not specified

Source: DataCite Commons (updated 2024-11-07; indexed 2024-07-13)

Download link:

https://rdr.ucl.ac.uk/articles/dataset/OCT5k_A_dataset_of_multi-disease_and_multi-graded_annotations_for_retinal_layers/22128671/3

Description:

The thickness and appearance of retinal layers are essential markers for diagnosing and studying eye diseases. Despite the increasing availability of imaging devices to scan and store large amounts of data, analyzing retinal images and generating trial endpoints has remained a manual, error-prone, and time-consuming task. In particular, the lack of large amounts of high-quality labels for different diseases hinders the development of automated algorithms. Therefore, we have compiled 5016 pixel-wise manual labels for 1672 optical coherence tomography (OCT) scans featuring two different diseases as well as healthy subjects to help democratize the process of developing novel automatic techniques. We also collected 4698 bounding box annotations for a subset of 566 scans across 9 classes of disease biomarkers. Due to variations in retinal morphology, intensity range, and changes in contrast and brightness, designing segmentation and detection methods that can generalize to different disease types is challenging. While machine learning-based methods can overcome these challenges, high-quality expert annotations are necessary for training. Publicly available annotated image datasets typically contain few images and/or only cover a single type of disease, and most are only annotated by a single grader. To address this gap, we present a comprehensive multi-grader and multi-disease dataset for training machine learning-based algorithms. The proposed dataset covers three subsets of scans (Age-related Macular Degeneration, Diabetic Macular Edema, and healthy) and annotations for two types of tasks (semantic segmentation and object detection).
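As a rough illustration of how pixel-wise, multi-grader layer labels of this kind might be used, the sketch below computes a per-column layer thickness from a label mask and a Dice overlap between two graders' masks. It is not taken from the dataset or its authors: the file format and label encoding of OCT5k are not described here, so the masks are toy nested lists, the label value `1` is a hypothetical layer ID, and the function names `layer_thickness` and `dice` are made up for this example.

```python
from typing import List

Mask = List[List[int]]  # rows x columns of integer layer labels (assumed encoding)


def layer_thickness(mask: Mask, label: int) -> List[int]:
    """Count the pixels carrying `label` in each column (A-scan) of the mask."""
    n_cols = len(mask[0])
    return [sum(1 for row in mask if row[c] == label) for c in range(n_cols)]


def dice(mask_a: Mask, mask_b: Mask, label: int) -> float:
    """Dice overlap of one label between two graders' masks of equal shape."""
    a = b = both = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for pa, pb in zip(row_a, row_b):
            in_a, in_b = pa == label, pb == label
            a += in_a
            b += in_b
            both += in_a and in_b
    # Treat "label absent in both masks" as perfect agreement.
    return 1.0 if a + b == 0 else 2.0 * both / (a + b)


# Toy 4x3 masks: 0 = background, 1 = a hypothetical layer label.
grader_1 = [[0, 0, 0],
            [1, 1, 0],
            [1, 1, 1],
            [0, 0, 0]]
grader_2 = [[0, 0, 0],
            [1, 1, 1],
            [1, 1, 1],
            [0, 0, 0]]

print(layer_thickness(grader_1, 1))  # thickness in pixels, one value per column
print(dice(grader_1, grader_2, 1))   # agreement between the two graders
```

The thickness here is in pixels; converting it to a physical unit would require the axial resolution of the scanning device, which this listing does not state.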

Provider:

University College London

Created:

2024-02-09
