ApolloMoEBench

Name: ApolloMoEBench
Creator: maas
Published: 2025-12-05 16:21:12
License: not specified

ModelScope community · updated 2025-12-05 · indexed 2025-01-25

Download link:

https://modelscope.cn/datasets/FreedomIntelligence/ApolloMoEBench

Description:

# Democratizing Medical LLMs For Much More Languages

Covering 12 major languages (English, Chinese, French, Hindi, Spanish, Arabic, Russian, Japanese, Korean, German, Italian, Portuguese) and 38 minor languages so far.

<p align="center">
   📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a> • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> • 🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a> • 🌐 <a href="https://github.com/FreedomIntelligence/ApolloMoE" target="_blank">ApolloMoE</a>
</p>

![Apollo](assets/apollo_medium_final.png)

## 🌈 Update

* **[2024.10.15]** ApolloMoE repo is published! 🎉

## Language Coverage

12 major languages and 38 minor languages

<details>
  <summary>Click to view the language coverage</summary>

  ![ApolloMoE](assets/languages.png)

</details>

## Architecture

<details>
  <summary>Click to view the MoE routing image</summary>

  ![ApolloMoE](assets/hybrid_routing.png)

</details>

## Results

#### Dense

🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>

<details>
  <summary>Click to view the dense model results</summary>

  ![ApolloMoE](assets/dense_results.png)

</details>

#### Post-MoE

🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>

<details>
  <summary>Click to view the Post-MoE model results</summary>

  ![ApolloMoE](assets/post_moe_results.png)

</details>

## Usage Format

##### Apollo2

- 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
- 2B, 9B: User:{query}\nAssistant:{response}\<eos\>
- 3.8B: <|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>

##### Apollo-MoE

- 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>

## Dataset & Evaluation

- Dataset 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a>

  <details><summary>Click to expand</summary>

  ![ApolloMoE](assets/Dataset.png)

  - [Data category](https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus/tree/main/train)

  </details>

- Evaluation 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>

  <details><summary>Click to expand</summary>

  - EN:
    - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
    - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
    - [PubMedQA](https://huggingface.co/datasets/pubmed_qa): Not used in the paper because the results fluctuated too much.
    - [MMLU-Medical](https://huggingface.co/datasets/cais/mmlu)
      - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - ZH:
    - [MedQA-MCMLE](https://huggingface.co/datasets/bigbio/med_qa/viewer/med_qa_zh_4options_bigbio_qa/test)
    - [CMB-single](https://huggingface.co/datasets/FreedomIntelligence/CMB): Not used in the paper
      - A random sample of 2,000 single-answer multiple-choice questions.
    - [CMMLU-Medical](https://huggingface.co/datasets/haonan-li/cmmlu)
      - Anatomy, Clinical_knowledge, College_medicine, Genetics, Nutrition, Traditional_chinese_medicine, Virology
    - [CMExam](https://github.com/williamliujl/CMExam): Not used in the paper
      - A random sample of 2,000 multiple-choice questions.
  - ES: [Head_qa](https://huggingface.co/datasets/head_qa)
  - FR:
    - [Frenchmedmcqa](https://github.com/qanastek/FrenchMedMCQA)
    - [MMLU_FR]
      - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - HI: [MMLU_HI](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Hindi)
    - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - AR: [MMLU_AR](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Arabic)
    - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - JA: [IgakuQA](https://github.com/jungokasai/IgakuQA)
  - KO: [KorMedMCQA](https://huggingface.co/datasets/sean0042/KorMedMCQA)
  - IT:
    - [MedExpQA](https://huggingface.co/datasets/HiTZ/MedExpQA)
    - [MMLU_IT]
      - Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
  - DE: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): German part
  - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
  - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)
  - Minor Langs: MMLU Translated Medical Part

  </details>

## Results reproduction

<details><summary>Click to expand</summary>

We take Apollo2-7B or Apollo-MoE-0.5B as an example.

1. Download the dataset for the project:

   ```
   bash 0.download_data.sh
   ```

2. Prepare test and dev data for the specific model:

   - Create test data with special tokens. The script name contains `&`, so it is quoted to keep the shell from treating it as a background operator.

   ```
   bash "1.data_process_test&dev.sh"
   ```

3. Prepare train data for the specific model (create tokenized data in advance):

   - You can adjust the data training order and the number of training epochs in this step.

   ```
   bash 2.data_process_train.sh
   ```

4. Train the model:

   - To train on multiple nodes, refer to ./src/sft/training_config/zero_multi.yaml

   ```
   bash 3.single_node_train.sh
   ```

5. Evaluate your model (generate benchmark scores):

   ```
   bash 4.eval.sh
   ```

</details>

## Citation

Please use the following citation if you intend to use our dataset for training or evaluation:

```
@misc{zheng2024efficientlydemocratizingmedicalllms,
      title={Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts},
      author={Guorui Zheng and Xidong Wang and Juhao Liang and Nuo Chen and Yuping Zheng and Benyou Wang},
      year={2024},
      eprint={2410.10626},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.10626},
}
```
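
As an illustration of the templates listed under Usage Format, the following is a minimal Python sketch that fills them in for a given model. It is not code from the ApolloMoE repository: the names `TEMPLATES`, `MODEL_FAMILY`, `build_training_example`, and `build_inference_prompt` are hypothetical, `\n` is assumed to mean a literal newline, and the 3.8B template is assumed to use the `<|assistant|>` spelling of the role token.

```python
# Prompt templates transcribed from the "Usage Format" section.
# All names below are hypothetical helpers for illustration only.
TEMPLATES = {
    # Apollo2 0.5B / 1.5B / 7B and all Apollo-MoE sizes
    "endoftext": "User:{query}\nAssistant:{response}<|endoftext|>",
    # Apollo2 2B / 9B
    "eos": "User:{query}\nAssistant:{response}<eos>",
    # Apollo2 3.8B (assumption: the role token is spelled <|assistant|>)
    "chat_tags": "<|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>",
}

MODEL_FAMILY = {
    "Apollo2-0.5B": "endoftext",
    "Apollo2-1.5B": "endoftext",
    "Apollo2-7B": "endoftext",
    "Apollo2-2B": "eos",
    "Apollo2-9B": "eos",
    "Apollo2-3.8B": "chat_tags",
    "Apollo-MoE-0.5B": "endoftext",
    "Apollo-MoE-1.5B": "endoftext",
    "Apollo-MoE-7B": "endoftext",
}


def build_training_example(model: str, query: str, response: str) -> str:
    """Return the full query/response string in the model's format."""
    template = TEMPLATES[MODEL_FAMILY[model]]
    return template.format(query=query, response=response)


def build_inference_prompt(model: str, query: str) -> str:
    """Return the template up to (not including) the response slot."""
    template = TEMPLATES[MODEL_FAMILY[model]]
    prefix = template.split("{response}")[0]
    return prefix.format(query=query)


if __name__ == "__main__":
    print(build_inference_prompt("Apollo2-7B", "What causes anemia?"))
```

Cutting the template at the response slot gives the text a model would be asked to continue at inference time; whether the evaluation scripts build prompts this way is not confirmed by this page.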


Provider: maas

Created: 2025-01-20
