GastroVL, a multimodal dataset of 10,000 Esophagogastroduodenoscopy images with structured text descriptions

Source: DataCite Commons. Updated: 2026-05-02. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19984390
Description:
GastroVL is a multimodal dataset of 10,000 esophagogastroduodenoscopy (EGD) images paired with structured text descriptions, covering ten anatomical sites of the upper gastrointestinal tract. Annotation followed a multi-stage expert process: senior endoscopists independently performed anatomical classification and text generation, and discrepancies were resolved through expert adjudication. Each structured description captures standardized features including location, morphology, size, margin, and colour, making the dataset directly suitable for training multimodal large language models (MLLMs) on tasks such as automated report generation, cross-modal retrieval, and fine-grained lesion understanding.

The dataset contains ten folders, each named after a specific anatomical location and prefixed with a two-digit numeric code: "01 esophagus", "02 Z-line", "03 cardia", "04 fundus of stomach", "05 body of stomach", "06 incisura angularis", "07 gastric antrum", "08 pylorus", "09 duodenal bulb", and "10 descending duodenum". Each anatomical location folder holds two subfolders named with a three-digit code: the first two digits encode the anatomical site, and the third digit indicates the clinical class (0=normal, 1=abnormal).

Each class folder in turn holds two subfolders that store the files: (1) image, which contains the EGD images in PNG format (e.g., 0100001.png); (2) text, which contains the corresponding text descriptions in TXT format (e.g., 0100001.txt). An image and its text description share the same base filename, so they can be paired directly. An index file named index.csv is also provided at the root directory for more efficient programmatic access.
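The folder layout described above can be traversed directly to pair each image with its description. The following is a minimal Python sketch, not an official loader: the names `Sample` and `iter_samples` are hypothetical, and it assumes that each class folder is named with exactly the three-digit code (for example "010") and that the two leaf folders are named "image" and "text". The column layout of index.csv is not stated in the description, so the sketch walks the directories instead of reading that file.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class Sample(NamedTuple):
    """One image/text pair (hypothetical helper type, not part of the dataset)."""
    site_code: str     # two-digit anatomical site code, e.g. "01"
    is_abnormal: bool  # third digit of the class folder: 0=normal, 1=abnormal
    image_path: Path
    text_path: Path


def iter_samples(root: Path) -> Iterator[Sample]:
    """Yield pairs found under <root>/<site folder>/<class folder>/image/*.png."""
    for image_path in sorted(root.glob("*/*/image/*.png")):
        class_dir = image_path.parent.parent
        code = class_dir.name  # assumption: exactly the three-digit code
        if len(code) != 3 or not code.isdigit() or code[2] not in "01":
            continue  # skip folders that do not match the assumed naming
        # The description states that image and text share the same base filename.
        text_path = class_dir / "text" / (image_path.stem + ".txt")
        if not text_path.is_file():
            continue  # skip images with no matching description
        yield Sample(code[:2], code[2] == "1", image_path, text_path)
```

Calling `list(iter_samples(Path("GastroVL")))`, where "GastroVL" stands for wherever the archive was extracted, would return the pairs in sorted path order. Images whose description file is missing are skipped silently here; a stricter loader might raise an error instead.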
Provider:
Zenodo
Created:
2026-05-02