
XMedbench

Source: ModelScope community · Updated: 2025-12-05 · Indexed: 2025-01-25
Download link:
https://modelscope.cn/datasets/FreedomIntelligence/XMedbench
Description:
# Multilingual Medicine: Model, Dataset, Benchmark, Code

Covering English, Chinese, French, Hindi, Spanish, and Arabic so far.

<p align="center">
👨🏻‍💻 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">GitHub</a> • 📃 <a href="https://arxiv.org/abs/2403.03640" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus" target="_blank">ApolloCorpus</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/XMedbench" target="_blank">XMedBench</a>
<br>
<a href="./README_zh.md">中文</a> | <a href="./README.md">English</a>
</p>

![XMedBench](assets/XMedBench.png)

## 🌈 Update

* **[2024.03.07]** [Paper](https://arxiv.org/abs/2403.03640) released.
* **[2024.02.12]** <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus" target="_blank">ApolloCorpus</a> and <a href="https://huggingface.co/datasets/FreedomIntelligence/XMedbench" target="_blank">XMedBench</a> are published! 🎉
* **[2024.01.23]** Apollo repo is published! 🎉

## Results

![Apollo](assets/result.png)

## Usage

- [Zip File](https://huggingface.co/datasets/FreedomIntelligence/XMedbench/blob/main/XMedbench.zip)
- [Data category](https://huggingface.co/datasets/FreedomIntelligence/XMedbench/tree/main/test)

## Data

- EN:
  - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
  - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
  - [PubMedQA](https://huggingface.co/datasets/pubmed_qa): Not used in the paper because the results fluctuated too much.
  - [MMLU-Medical](https://huggingface.co/datasets/cais/mmlu)
    - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- ZH:
  - [MedQA-MCMLE](https://huggingface.co/datasets/bigbio/med_qa/viewer/med_qa_zh_4options_bigbio_qa/test)
  - [CMB-single](https://huggingface.co/datasets/FreedomIntelligence/CMB): Not used in the paper
    - 2,000 randomly sampled single-answer multiple-choice questions
  - [CMMLU-Medical](https://huggingface.co/datasets/haonan-li/cmmlu)
    - Anatomy, Clinical_knowledge, College_medicine, Genetics, Nutrition, Traditional_chinese_medicine, Virology
  - [CMExam](https://github.com/williamliujl/CMExam): Not used in the paper
    - 2,000 randomly sampled multiple-choice questions
- ES: [Head_qa](https://huggingface.co/datasets/head_qa)
- FR: [Frenchmedmcqa](https://github.com/qanastek/FrenchMedMCQA)
- HI: [MMLU_HI](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Hindi)
  - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- AR: [MMLU_Ara](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Arabic)
  - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine

## Citation

Please use the following citation if you use our dataset for training or evaluation:

```
@misc{wang2024apollo,
   title={Apollo: Lightweight Multilingual Medical LLMs towards Democratizing Medical AI to 6B People},
   author={Xidong Wang and Nuo Chen and Junyin Chen and Yan Hu and Yidong Wang and Xiangbo Wu and Anningzhe Gao and Xiang Wan and Haizhou Li and Benyou Wang},
   year={2024},
   eprint={2403.03640},
   archivePrefix={arXiv},
   primaryClass={cs.CL}
}
```
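
The Usage section points to a single archive, `XMedbench.zip`, but does not describe its internal layout. The following is a minimal Python sketch for inspecting that archive before writing any evaluation code. It assumes the `huggingface_hub` package is installed; the helper names `download_archive` and `group_members_by_folder` are hypothetical and not part of the Apollo repository, and no particular folder or file names inside the zip are assumed.

```python
import io
import zipfile
from collections import defaultdict


def download_archive() -> str:
    """Fetch XMedbench.zip from the Hugging Face Hub and return its local path.

    Needs network access and `pip install huggingface_hub`.
    """
    # Imported lazily so the inspection helper below works without this package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="FreedomIntelligence/XMedbench",
        filename="XMedbench.zip",
        repo_type="dataset",
    )


def group_members_by_folder(archive) -> dict[str, list[str]]:
    """Map each folder inside a zip to the sorted file names it contains.

    `archive` may be a path or a binary file-like object. Files stored at
    the top level of the archive are grouped under the empty string.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            folder, _, name = info.filename.rpartition("/")
            groups[folder].append(name)
    return {folder: sorted(names) for folder, names in sorted(groups.items())}


# Example (requires network access):
#   for folder, names in group_members_by_folder(download_archive()).items():
#       print(folder or ".", len(names), "files")
```

Listing the members first avoids guessing at per-language file names or record schemas, which this page does not specify.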

Provider:
maas
Created:
2025-01-20