mkurman/medmcqa-hard
Source: Hugging Face. Updated: 2025-10-05. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/mkurman/medmcqa-hard
Description:
MedMCQA-Hard is a harder, de-duplicated version of MedMCQA, designed to reduce memorization and strengthen generalization on medical multiple-choice questions. Each correct option appears in multiple phrasing and list variants, so a model cannot rely on surface-form recall and must reason over the content. Every item includes one canonical correct answer together with both a single incorrect answer and a set of incorrect answers, which makes it suitable for DPO, RLAIF/GRPO, and contrastive objectives. The dataset also includes a lightweight messages format that is convenient for instruction-tuned models and SFT.
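As a rough illustration of why one correct answer plus a set of incorrect answers suits preference-based training, the sketch below expands a single item into prompt/chosen/rejected triples, a layout commonly used for DPO-style data. The field names `question`, `correct`, and `incorrect_set` are assumptions made for this example and are not taken from the dataset's actual schema; the sample record is invented, not drawn from the dataset.

```python
from typing import Dict, List


def to_preference_pairs(item: Dict) -> List[Dict[str, str]]:
    """Expand one item into (prompt, chosen, rejected) triples.

    Assumes hypothetical keys: 'question', 'correct', 'incorrect_set'.
    """
    prompt = item["question"]
    chosen = item["correct"]
    # One preference pair per incorrect answer: the canonical correct
    # answer is always 'chosen', each distractor is 'rejected' once.
    return [
        {"prompt": prompt, "chosen": chosen, "rejected": wrong}
        for wrong in item["incorrect_set"]
    ]


# Invented record for illustration only (not from the dataset).
example = {
    "question": "Deficiency of which vitamin causes scurvy?",
    "correct": "Vitamin C",
    "incorrect_set": ["Vitamin A", "Vitamin D", "Vitamin K"],
}

pairs = to_preference_pairs(example)
for pair in pairs:
    print(pair["chosen"], ">", pair["rejected"])
```

Using the whole set of incorrect answers yields several pairs per question, while using only the single incorrect answer would yield exactly one; which is preferable depends on the training objective. The real column names should be checked against the dataset card before adapting this.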
Provider:
mkurman



