OCT5k: A dataset of multi-disease and multi-graded annotations for retinal layers

Name: OCT5k: A dataset of multi-disease and multi-graded annotations for retinal layers
Creator: University College London
Published: 2023-11-24 19:48:53
License: No description available

DataCite Commons; updated 2023-11-24; indexed 2025-04-17

Download link:

https://rdr.ucl.ac.uk/articles/dataset/OCT5k_A_dataset_of_multi-disease_and_multi-graded_annotations_for_retinal_layers/22128671/1

Resource description:

The thickness and appearance of retinal layers are essential markers for diagnosing and studying eye diseases. Despite the increasing availability of imaging devices to scan and store large amounts of data, analyzing retinal images and generating trial endpoints has remained a manual, error-prone, and time-consuming task. In particular, the lack of large amounts of high-quality labels for different diseases hinders the development of automated algorithms. Therefore, we have compiled 5016 pixel-wise manual labels for 1672 optical coherence tomography (OCT) scans featuring two different diseases as well as healthy subjects to help democratize the process of developing novel automatic techniques. We also collected 4698 bounding box annotations for a subset of 566 scans across 9 classes of disease biomarkers. Due to variations in retinal morphology, intensity range, and changes in contrast and brightness, designing segmentation and detection methods that can generalize to different disease types is challenging. While machine learning-based methods can overcome these challenges, high-quality expert annotations are necessary for training. Publicly available annotated image datasets typically contain few images and/or only cover a single type of disease, and most are only annotated by a single grader. To address this gap, we present a comprehensive multi-grader and multi-disease dataset for training machine learning-based algorithms. The proposed dataset covers three subsets of scans (Age-related Macular Degeneration, Diabetic Macular Edema, and healthy) and annotations for two types of tasks (semantic segmentation and object detection).
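The counts quoted in the description can be turned into per-scan averages with a few lines of arithmetic. The sketch below is only illustrative: the constant and function names are hypothetical, and the description does not state how annotations are distributed across scans or graders, so the results are averages and nothing more.

```python
# Counts taken from the dataset description; names are illustrative only.
NUM_SCANS = 1672          # OCT scans with pixel-wise labels
NUM_PIXEL_LABELS = 5016   # pixel-wise manual labels
NUM_BOX_SCANS = 566       # subset of scans with bounding boxes
NUM_BOXES = 4698          # bounding box annotations


def per_scan(total: int, scans: int) -> float:
    """Return the average number of annotations per scan."""
    return total / scans


labels_per_scan = per_scan(NUM_PIXEL_LABELS, NUM_SCANS)
boxes_per_scan = per_scan(NUM_BOXES, NUM_BOX_SCANS)

print(f"pixel-wise labels per scan: {labels_per_scan:.2f}")
print(f"bounding boxes per annotated scan: {boxes_per_scan:.2f}")
```

An average of three pixel-wise labels per scan is consistent with the multi-grader design described above, although the exact number of graders per scan is not given here.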

Provider:

University College London

Created:

2023-04-14
