XMedbench
Source: ModelScope community (updated 2025-12-05, indexed 2025-01-25)
Download link:
https://modelscope.cn/datasets/FreedomIntelligence/XMedbench
Description:
# Multilingual Medicine: Model, Dataset, Benchmark, Code
Covering English, Chinese, French, Hindi, Spanish, and Arabic so far
<p align="center">
👨🏻💻<a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Github</a> •📃 <a href="https://arxiv.org/abs/2403.03640" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus" target="_blank">ApolloCorpus</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/XMedbench" target="_blank">XMedBench</a>
<br> <a href="./README_zh.md"> 中文 </a> | <a href="./README.md"> English </a>
</p>

## 🌈 Update
* **[2024.03.07]** [Paper](https://arxiv.org/abs/2403.03640) released.
* **[2024.02.12]** <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus" target="_blank">ApolloCorpus</a> and <a href="https://huggingface.co/datasets/FreedomIntelligence/XMedbench" target="_blank">XMedBench</a> are published!🎉
* **[2024.01.23]** Apollo repo is published!🎉
## Results

## Usage
- [Zip File](https://huggingface.co/datasets/FreedomIntelligence/XMedbench/blob/main/XMedbench.zip)
- [Data category](https://huggingface.co/datasets/FreedomIntelligence/XMedbench/tree/main/test)
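
A minimal sketch of fetching the archive linked above using only the Python standard library. It assumes that the `blob` page URL can be turned into a direct download by swapping in `resolve` (the usual Hugging Face URL convention). The helper names and the `xmedbench_data` output directory are illustrative and are not part of the Apollo repository.

```python
import urllib.request
import zipfile
from pathlib import Path

# The README links to the "blob" page of the archive.
BLOB_URL = (
    "https://huggingface.co/datasets/FreedomIntelligence/XMedbench"
    "/blob/main/XMedbench.zip"
)


def to_download_url(blob_url: str) -> str:
    """Turn a Hugging Face 'blob' page URL into a direct-download URL."""
    return blob_url.replace("/blob/", "/resolve/", 1)


def download_and_extract(blob_url: str, out_dir: str) -> list[str]:
    """Download the archive, unpack it into out_dir, return member names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    zip_path = out / "XMedbench.zip"
    urllib.request.urlretrieve(to_download_url(blob_url), zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out)
        return zf.namelist()


# Example (requires network access):
# names = download_and_extract(BLOB_URL, "xmedbench_data")
# print(names[:10])
```

The network call is left commented out so the helpers can be imported and inspected without downloading anything.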
## Data:
- EN:
  - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
  - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
  - [PubMedQA](https://huggingface.co/datasets/pubmed_qa): Not used in the paper because the results fluctuated too much.
  - [MMLU-Medical](https://huggingface.co/datasets/cais/mmlu)
    - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- ZH:
  - [MedQA-MCMLE](https://huggingface.co/datasets/bigbio/med_qa/viewer/med_qa_zh_4options_bigbio_qa/test)
  - [CMB-single](https://huggingface.co/datasets/FreedomIntelligence/CMB): Not used in the paper
    - A random sample of 2,000 single-answer multiple-choice questions.
  - [CMMLU-Medical](https://huggingface.co/datasets/haonan-li/cmmlu)
    - Anatomy, Clinical_knowledge, College_medicine, Genetics, Nutrition, Traditional_chinese_medicine, Virology
  - [CExam](https://github.com/williamliujl/CMExam): Not used in the paper
    - A random sample of 2,000 multiple-choice questions.
- ES: [Head_qa](https://huggingface.co/datasets/head_qa)
- FR: [Frenchmedmcqa](https://github.com/qanastek/FrenchMedMCQA)
- HI: [MMLU_HI](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Hindi)
  - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- AR: [MMLU_Ara](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Arabic)
  - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
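
The composition listed above can be written down as a small lookup table, which is handy when looping over languages in an evaluation script. The sketch below is illustrative only: the names `XMEDBENCH`, `paper_sets`, and `sample_questions` are hypothetical, treating every set not marked "Not used in the paper" as used is an inference from the list, and the seed and sampling routine are assumptions, since the source only states that 2,000 questions were randomly sampled.

```python
import random

# Test sets per language, in the order listed above.
XMEDBENCH = {
    "EN": ["MedQA-USMLE", "MedMCQA", "PubMedQA", "MMLU-Medical"],
    "ZH": ["MedQA-MCMLE", "CMB-single", "CMMLU-Medical", "CExam"],
    "ES": ["Head_qa"],
    "FR": ["Frenchmedmcqa"],
    "HI": ["MMLU_HI"],
    "AR": ["MMLU_Ara"],
}

# Sets the list explicitly marks as not used in the paper.
NOT_USED_IN_PAPER = {"PubMedQA", "CMB-single", "CExam"}


def paper_sets(lang):
    """Sets for one language, minus those marked as unused in the paper."""
    return [name for name in XMEDBENCH[lang] if name not in NOT_USED_IN_PAPER]


def sample_questions(questions, k=2000, seed=0):
    """Draw k questions without replacement, reproducibly.

    The seed and the use of random.Random are assumptions; the source
    only says that 2,000 questions were randomly sampled.
    """
    questions = list(questions)
    if len(questions) <= k:
        return questions
    return random.Random(seed).sample(questions, k)
```

Fixing the seed makes the subsample repeatable across runs, which matters when comparing models on the same 2,000 questions.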
## Citation
Please use the following citation if you intend to use our dataset for training or evaluation:
```
@misc{wang2024apollo,
title={Apollo: Lightweight Multilingual Medical LLMs towards Democratizing Medical AI to 6B People},
author={Xidong Wang and Nuo Chen and Junyin Chen and Yan Hu and Yidong Wang and Xiangbo Wu and Anningzhe Gao and Xiang Wan and Haizhou Li and Benyou Wang},
year={2024},
eprint={2403.03640},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider: maas
Created: 2025-01-20



