Malikeh1375/medical-question-answering-datasets
Source: Hugging Face | Updated: 2023-11-02 | Indexed: 2024-06-15
Download link:
https://hf-mirror.com/datasets/Malikeh1375/medical-question-answering-datasets
Resource description:
---
language:
- en
task_categories:
- question-answering
tags:
- medical
- clinical
- healthcare
dataset_info:
- config_name: all-processed
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 276980695
    num_examples: 246678
  download_size: 0
  dataset_size: 276980695
- config_name: chatdoctor_healthcaremagic
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 126454896
    num_examples: 112165
  download_size: 70518147
  dataset_size: 126454896
- config_name: chatdoctor_icliniq
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 7347194
    num_examples: 7321
  download_size: 4153680
  dataset_size: 7347194
- config_name: medical_meadow_cord19
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 1336834621
    num_examples: 821007
  download_size: 752855706
  dataset_size: 1336834621
- config_name: medical_meadow_health_advice
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 2196957
    num_examples: 8676
  download_size: 890725
  dataset_size: 2196957
- config_name: medical_meadow_medical_flashcards
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 16453987
    num_examples: 33955
  download_size: 6999958
  dataset_size: 16453987
- config_name: medical_meadow_mediqa
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 15690088
    num_examples: 2208
  download_size: 3719929
  dataset_size: 15690088
- config_name: medical_meadow_medqa
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 10225018
    num_examples: 10178
  download_size: 5505473
  dataset_size: 10225018
- config_name: medical_meadow_mmmlu
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 1442124
    num_examples: 3787
  download_size: 685604
  dataset_size: 1442124
- config_name: medical_meadow_pubmed_causal
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 846695
    num_examples: 2446
  download_size: 210947
  dataset_size: 846695
- config_name: medical_meadow_wikidoc
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 10224074
    num_examples: 10000
  download_size: 5593178
  dataset_size: 10224074
- config_name: medical_meadow_wikidoc_patient_information
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 3262558
    num_examples: 5942
  download_size: 1544286
  dataset_size: 3262558
configs:
- config_name: all-processed
  data_files:
  - split: train
    path: all-processed/train-*
- config_name: chatdoctor_healthcaremagic
  data_files:
  - split: train
    path: chatdoctor_healthcaremagic/train-*
- config_name: chatdoctor_icliniq
  data_files:
  - split: train
    path: chatdoctor_icliniq/train-*
- config_name: medical_meadow_cord19
  data_files:
  - split: train
    path: medical_meadow_cord19/train-*
- config_name: medical_meadow_health_advice
  data_files:
  - split: train
    path: medical_meadow_health_advice/train-*
- config_name: medical_meadow_medical_flashcards
  data_files:
  - split: train
    path: medical_meadow_medical_flashcards/train-*
- config_name: medical_meadow_mediqa
  data_files:
  - split: train
    path: medical_meadow_mediqa/train-*
- config_name: medical_meadow_medqa
  data_files:
  - split: train
    path: medical_meadow_medqa/train-*
- config_name: medical_meadow_mmmlu
  data_files:
  - split: train
    path: medical_meadow_mmmlu/train-*
- config_name: medical_meadow_pubmed_causal
  data_files:
  - split: train
    path: medical_meadow_pubmed_causal/train-*
- config_name: medical_meadow_wikidoc
  data_files:
  - split: train
    path: medical_meadow_wikidoc/train-*
- config_name: medical_meadow_wikidoc_patient_information
  data_files:
  - split: train
    path: medical_meadow_wikidoc_patient_information/train-*
---
This dataset collects English medical question-answering data in twelve configurations, such as chatdoctor_healthcaremagic and medical_meadow_cord19. Each configuration provides a single train split whose records hold three string fields: instruction, input, and output. The all-processed configuration additionally carries an int64 __index_level_0__ column. The data is suited to healthcare-related natural language processing tasks.
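A minimal sketch of how one configuration might be loaded and its records turned into training text, assuming the Hugging Face `datasets` library is installed and the repository is reachable. The `load_config` and `to_prompt` helpers, the prompt layout, and the sample record are illustrative assumptions and are not part of the dataset card.

```python
from typing import Dict, Optional

REPO_ID = "Malikeh1375/medical-question-answering-datasets"


def load_config(config_name: str = "chatdoctor_icliniq"):
    """Load the train split of one configuration (needs network access)."""
    # Imported lazily so that to_prompt() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split="train")


def to_prompt(record: Dict[str, Optional[str]]) -> str:
    """Join instruction, optional input, and output into one text block.

    The section headers are an assumed layout for illustration only.
    """
    instruction = (record.get("instruction") or "").strip()
    user_input = (record.get("input") or "").strip()
    output = (record.get("output") or "").strip()

    parts = [f"### Instruction:\n{instruction}"]
    if user_input:  # skip the section when the input field is empty
        parts.append(f"### Input:\n{user_input}")
    parts.append(f"### Response:\n{output}")
    return "\n\n".join(parts)


# Hypothetical record with the three string fields listed in the card.
sample = {
    "instruction": "Answer the medical question.",
    "input": "What is hypertension?",
    "output": "Hypertension is persistently elevated blood pressure.",
}
prompt_text = to_prompt(sample)
```

With network access, `to_prompt(load_config()[0])` would apply the same formatting to the first record of the chosen configuration.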
Provided by:
Malikeh1375
Summary of original information
Dataset overview
Language and task category
- Language: English (en)
- Task category: question-answering
Tags
- medical
- clinical
- healthcare
Dataset configuration information
Configuration: all-processed
- Features:
  - instruction: string
  - input: string
  - output: string
  - __index_level_0__: int64
- Splits:
  - train:
    - Bytes: 276980695
    - Examples: 246678
- Download size (bytes): 0
- Dataset size (bytes): 276980695
Configuration: chatdoctor_healthcaremagic
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 126454896
    - Examples: 112165
- Download size (bytes): 70518147
- Dataset size (bytes): 126454896
Configuration: chatdoctor_icliniq
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 7347194
    - Examples: 7321
- Download size (bytes): 4153680
- Dataset size (bytes): 7347194
Configuration: medical_meadow_cord19
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 1336834621
    - Examples: 821007
- Download size (bytes): 752855706
- Dataset size (bytes): 1336834621
Configuration: medical_meadow_health_advice
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 2196957
    - Examples: 8676
- Download size (bytes): 890725
- Dataset size (bytes): 2196957
Configuration: medical_meadow_medical_flashcards
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 16453987
    - Examples: 33955
- Download size (bytes): 6999958
- Dataset size (bytes): 16453987
Configuration: medical_meadow_mediqa
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 15690088
    - Examples: 2208
- Download size (bytes): 3719929
- Dataset size (bytes): 15690088
Configuration: medical_meadow_medqa
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 10225018
    - Examples: 10178
- Download size (bytes): 5505473
- Dataset size (bytes): 10225018
Configuration: medical_meadow_mmmlu
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 1442124
    - Examples: 3787
- Download size (bytes): 685604
- Dataset size (bytes): 1442124
Configuration: medical_meadow_pubmed_causal
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 846695
    - Examples: 2446
- Download size (bytes): 210947
- Dataset size (bytes): 846695
Configuration: medical_meadow_wikidoc
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 10224074
    - Examples: 10000
- Download size (bytes): 5593178
- Dataset size (bytes): 10224074
Configuration: medical_meadow_wikidoc_patient_information
- Features:
  - instruction: string
  - input: string
  - output: string
- Splits:
  - train:
    - Bytes: 3262558
    - Examples: 5942
- Download size (bytes): 1544286
- Dataset size (bytes): 3262558
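
The per-configuration figures listed above lend themselves to simple arithmetic, such as the average record size and the ratio of download size to dataset size. The following is a minimal sketch over three of the listed configurations; the `STATS` table and helper names are illustrative assumptions, while the numbers are copied from the listing.

```python
# name -> (num_examples, dataset_size in bytes, download_size in bytes),
# copied from the configuration listing.
STATS = {
    "chatdoctor_healthcaremagic": (112165, 126454896, 70518147),
    "chatdoctor_icliniq": (7321, 7347194, 4153680),
    "medical_meadow_mediqa": (2208, 15690088, 3719929),
}


def avg_bytes_per_example(name: str) -> float:
    """Average stored size of one record in the given configuration."""
    num_examples, dataset_size, _ = STATS[name]
    return dataset_size / num_examples


def download_ratio(name: str) -> float:
    """Download size as a fraction of the dataset size."""
    _, dataset_size, download_size = STATS[name]
    return download_size / dataset_size


for name in STATS:
    print(
        f"{name}: {avg_bytes_per_example(name):.0f} bytes/example, "
        f"download is {download_ratio(name):.0%} of dataset size"
    )
```

Comparing the averages gives a rough sense of how long records are in each subset before any data is downloaded.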



