WinterSchool/MideficsDataset
Source: Hugging Face. Updated 2024-03-05; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/WinterSchool/MideficsDataset
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: image
    dtype: image
  - name: conversation
    struct:
    - name: data
      list:
      - name: answer
        dtype: string
      - name: question
        dtype: string
  splits:
  - name: train
    num_bytes: 2132989003.9490128
    num_examples: 3800
  - name: test
    num_bytes: 112823892.05098726
    num_examples: 201
  download_size: 2244437082
  dataset_size: 2245812896
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
task_categories:
- question-answering
- visual-question-answering
language:
- en
tags:
- medical
- image
- image-to-text
pretty_name: Midefics conversational dataset
size_categories:
- 1K<n<10K
---
MideficsDataset is a dataset of conversations about radiology and skin cancer images. It is intended for training and testing medical visual question answering (VQA) systems.
The dataset is built from [MURA](https://arxiv.org/abs/1712.06957), [ISIC](https://www.isic-archive.com/), and [ROCO](https://www.semanticscholar.org/paper/Radiology-Objects-in-COntext-(ROCO)%3A-A-Multimodal-Pelka-Koitka/a564fabf130ff6e2742cfba90c7a4018937d764d), which are free, open-access online datasets of medical images.
The conversations were generated with GPT-3.5-turbo from the metadata associated with each image.
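
The card metadata declares each record as an `id`, an `image`, and a `conversation` struct whose `data` field is a list of question/answer turns. A minimal sketch of flattening one record into (id, question, answer) rows is shown below. The helper name `iter_qa_pairs` and the sample record are hypothetical and only mirror the declared schema; the sample's text is made up for illustration, not taken from the dataset, and the sketch assumes each turn arrives as a dict with `question` and `answer` keys.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_qa_pairs(record: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (id, question, answer) for every turn in one record.

    Assumes the layout declared in the card metadata:
    record["conversation"]["data"] is a list of dicts, each with
    "question" and "answer" string fields.
    """
    for turn in record["conversation"]["data"]:
        yield record["id"], turn["question"], turn["answer"]


# Hypothetical record: the field names follow the card's schema, but the
# id and the text are invented placeholders. The image is left as None here.
example = {
    "id": "example-0001",
    "image": None,
    "conversation": {
        "data": [
            {"question": "Placeholder question 1", "answer": "Placeholder answer 1"},
            {"question": "Placeholder question 2", "answer": "Placeholder answer 2"},
        ]
    },
}

pairs = list(iter_qa_pairs(example))
for record_id, question, answer in pairs:
    print(record_id, "|", question, "|", answer)
```

With the Hugging Face `datasets` library installed, real records could presumably be obtained with `load_dataset("WinterSchool/MideficsDataset", split="train")` and passed to the same helper; that call downloads roughly 2.2 GB according to the metadata above.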
Provider:
WinterSchool
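Summary of original information
Dataset overview
Dataset information
- Features:
  - id: string
  - image: image
  - conversation: struct containing a list named data; each list item has an answer (string) and a question (string)
- Splits:
  - train: 2132989003.9490128 bytes, 3800 examples
  - test: 112823892.05098726 bytes, 201 examples
- Download size: 2244437082 bytes
- Dataset size: 2245812896 bytes
Configuration
- Default config:
  - train data files: data/train-*
  - test data files: data/test-*
Task categories
- Question answering
- Visual question answering
Language
- English
Tags
- Medical
- Image
- Image-to-text
Name
- Pretty name: Midefics conversational dataset
Size category
- 1K<n<10K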
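
The split figures listed above can be sanity-checked with a little arithmetic. The sketch below uses only the numbers stated in the metadata; the variable names are illustrative. It checks that the two per-split byte counts add up to the reported dataset size and derives the test share and the average size per example.

```python
# Figures copied from the dataset metadata.
train_examples, test_examples = 3800, 201
train_bytes, test_bytes = 2132989003.9490128, 112823892.05098726
dataset_size = 2245812896

total_examples = train_examples + test_examples

# Share of examples held out for testing (roughly 5%).
test_fraction = test_examples / total_examples

# The fractional per-split byte counts should sum to the integer dataset size.
bytes_match = round(train_bytes + test_bytes) == dataset_size

# Average uncompressed size per example, in bytes (about 0.56 MB).
avg_bytes_per_example = dataset_size / total_examples

print(f"total examples: {total_examples}")
print(f"test fraction: {test_fraction:.4f}")
print(f"split bytes sum to dataset_size: {bytes_match}")
print(f"average bytes per example: {avg_bytes_per_example:.0f}")
```

The non-integer byte counts suggest the per-split sizes were estimated in proportion to the example counts rather than measured separately, though the card does not say so.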



