Medical Images, Captions, and Textual References Dataset (MedICaT)

Name: Medical Images, Captions, and Textual References Dataset (MedICaT)
Creator: Alibaba Cloud Tianchi (阿里云天池)
Published: 2026-06-05 17:39:10
License: No description available

Alibaba Cloud Tianchi, updated 2026-06-05, indexed 2024-03-07

Download link:

https://tianchi.aliyun.com/dataset/83729

Description:

MedICaT is a dataset of medical images, captions, subfigure-subcaption annotations, and inline textual references. Figures and captions are extracted from open access articles in PubMed Central, and the corresponding reference text is derived from S2ORC. The dataset is intended mainly for studying how inline text in an article refers to medical images and to their captions, subfigures, and subcaptions.

Provider:

Alibaba Cloud Tianchi

Created:

2020-11-21

Collected summary

Dataset introduction