IPATH Dataset: 45,609 Curated Image-Text Pairs for Histopathology Applications

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14278845
Description:
Recent advances in artificial intelligence (AI) have enabled the identification of patterns in pathology images, improving diagnostic accuracy and decision support systems. However, progress has been limited by the lack of publicly available medical images. To address this scarcity, we explore Instagram as a novel source of pathology images with expert annotations. We curated the IPATH dataset from Instagram, comprising 45,609 pathology image-text pairs, using a combination of classifiers, large language models, and manual filtering. To demonstrate the value of this dataset, we developed a multimodal AI model called IP-CLIP by fine-tuning the pre-trained CLIP model on the IPATH dataset. IP-CLIP outperforms the original CLIP model in classifying new pathology images on two downstream tasks (zero-shot classification and linear probing) using two external histopathology datasets. These results demonstrate the effectiveness of the IPATH dataset and highlight the potential of social media data to advance AI models for medical image classification.
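To make the zero-shot classification task mentioned above more concrete, the following is a minimal sketch of how a CLIP-style model typically assigns a class: image and class-prompt embeddings are L2-normalized, compared by cosine similarity, and passed through a softmax. It is not the authors' evaluation code. The function name `zero_shot_classify`, the `temperature` value, and the toy 3-dimensional embeddings are all assumptions made for illustration; real embeddings would come from the image and text encoders of CLIP or IP-CLIP.

```python
import numpy as np

def zero_shot_classify(image_emb, text_emb, temperature=0.01):
    """Hypothetical helper: pick the class whose text embedding is closest.

    image_emb: (n_images, d) array of image embeddings.
    text_emb:  (n_classes, d) array of class-prompt embeddings.
    Returns (predicted class index per image, class probabilities).
    """
    # L2-normalize so the dot product equals cosine similarity.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Scaled similarities act as logits (temperature is an assumed value).
    logits = img @ txt.T / temperature

    # Numerically stable softmax over classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs

# Toy embeddings standing in for encoder outputs (illustrative only).
text_emb = np.array([[1.0, 0.0, 0.0],    # prompt for class 0
                     [0.0, 1.0, 0.0]])   # prompt for class 1
image_emb = np.array([[0.9, 0.1, 0.0],   # closest to class 0
                      [0.2, 0.8, 0.1]])  # closest to class 1

pred, probs = zero_shot_classify(image_emb, text_emb)
print(pred)
```

Linear probing, the second task, differs in that the encoder is frozen and a linear classifier is trained on the image embeddings with labeled examples, whereas the zero-shot path above needs no labeled training images.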
Created:
2024-12-17