
Skin-Path

Source: ieee-dataport.org (indexed 2025-03-26)
Download link:
https://ieee-dataport.org/documents/skin-path
Description:
Vision-language (VL) datasets are essential for advancing the capabilities of VL models, particularly in specialized domains like medical imaging. However, existing medical VL datasets are relatively small and predominantly focus on chest X-rays, limiting their applicability to other areas. To address this gap, we introduce the Skin-Path dataset, a comprehensive VL dataset specifically curated for histopathology. This dataset comprises 194 H&E-stained whole slide images (WSIs) from distinct patients, digitized at 20x magnification and annotated with diagnostic reports by senior pathologists. From these WSIs, we extracted 277,761 image patches, each sized 300×300 pixels, accompanied by corresponding captions. The Skin-Path dataset covers 10 distinct skin diseases, including seborrhoeic keratosis, basal cell carcinoma, and squamous cell carcinoma. Our analysis demonstrates significant diversity in the dataset, with a unique word distribution distinct from general VL datasets, as visualized through word clouds. This dataset provides a robust foundation for training and evaluating VL models tailored for histopathological applications.
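
The description states the patch size (300×300 pixels) and the totals (277,761 patches from 194 WSIs), but not how the patches were cut. As a rough illustration only, the sketch below lays a non-overlapping grid of 300×300 boxes over a region and computes the average number of patches per slide implied by those totals. The function name `patch_boxes`, the region dimensions, the non-overlapping stride, and the choice to skip partial edge tiles are all assumptions, not details from the dataset authors.

```python
from typing import Iterator, Tuple

PATCH_SIZE = 300  # pixels per side, as stated in the dataset description


def patch_boxes(
    width: int, height: int, size: int = PATCH_SIZE
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (left, top, right, bottom) boxes for non-overlapping tiles.

    Assumption: partial tiles at the right and bottom edges are skipped.
    """
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield left, top, left + size, top + size


# Hypothetical region dimensions, chosen only for illustration.
boxes = list(patch_boxes(width=1250, height=900))

# Average patches per slide implied by the stated totals (about 1,432).
avg_patches_per_slide = 277_761 / 194
```

Each box could then be passed to whatever slide reader is in use to crop the corresponding pixels. A real pipeline would likely also filter out background tiles, which this sketch does not attempt.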

Provider:
ieee-dataport.org