Layered-Labs/benchbase-medmcqa

Name: Layered-Labs/benchbase-medmcqa
Creator: Layered-Labs
Published: 2026-04-05 22:26:28
License: No description provided

Hugging Face · Updated 2026-04-05 · Indexed 2026-04-12

Download link:

https://hf-mirror.com/datasets/Layered-Labs/benchbase-medmcqa

Description:

---
language:
- en
license: mit
pretty_name: "BenchBase: MedMCQA"
tags:
- medical
- question-answering
- multiple-choice
- benchmark
- clinical-knowledge
- medical-education
- benchbase
task_categories:
- question-answering
- multiple-choice
size_categories:
- 100K<n<1M
---

<h1 align="center">BenchBase: MedMCQA</h1>

<h3 align="center">193,000+ multiple-choice medical questions from Indian postgraduate medical entrance exams, spanning 2,400+ healthcare topics.</h3>

<p align="center">
  <a href="https://huggingface.co/datasets/Layered-Labs/benchbase-medmcqa">
    <img src="https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-md.svg" alt="Dataset on HuggingFace"/>
  </a>
  <img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="License"/>
</p>

<p align="center">
  <img src="https://huggingface.co/datasets/Layered-Labs/benchbase-medmcqa/resolve/main/banner.svg" alt="BenchBase: MedMCQA" width="100%"/>
</p>

---

## Overview

BenchBase: MedMCQA is a repackaging of the MedMCQA dataset for the Layered Labs BenchBase benchmark suite. The source dataset contains 193,155 multiple-choice questions drawn from AIIMS and NEET PG medical entrance examinations, covering 2,400+ healthcare topics across clinical medicine, pharmacology, pathology, anatomy, biochemistry, and more.

Each question includes four answer options, a correct answer label, subject and topic annotations, and optional expert explanations. The train split (182,822 questions) includes correct answer labels; the test split (6,150 questions) withholds them for evaluation. This repackaging standardizes column names and encoding to match the BenchBase schema.

---

## Statement of Need

Evaluating medical language models requires standardized, clinically relevant benchmarks. MedMCQA is one of the largest and most topic-diverse medical QA datasets available, but the original release uses inconsistent column names and encoding that complicate integration into multi-benchmark evaluation pipelines.
This BenchBase repackaging normalizes the schema, documents all fields explicitly, and makes the dataset drop-in compatible with other BenchBase medical benchmarks for unified evaluation.

---

## Intended Use

This dataset is intended for researchers and engineers evaluating or fine-tuning language models on medical question answering. It is appropriate for benchmarking general medical knowledge, subject-specific clinical reasoning, and model calibration across 2,400+ topics. It is not intended as a clinical decision support tool or as a substitute for professional medical judgment.

---

## Limitations

Questions are drawn from Indian postgraduate medical entrance exams and may reflect regional clinical guidelines, drug names, and diagnostic conventions that differ from US or European practice. The test split withholds correct answer labels, so evaluation requires submission to the original leaderboard for held-out accuracy. Expert explanations are available for only a subset of training questions. Some questions have ambiguous or disputed correct answers as noted in the research literature.

---

## Dataset Structure

### Splits

| Split | Rows |
|-------|------|
| train | 182,822 |
| validation | 4,183 |
| test | 6,150 |

### Features

| Column | Type | Description |
|--------|------|-------------|
| `id` | `string` | Unique question identifier. |
| `question` | `string` | The medical question text. |
| `opa` | `string` | Answer option A. |
| `opb` | `string` | Answer option B. |
| `opc` | `string` | Answer option C. |
| `opd` | `string` | Answer option D. |
| `answer` | `int8` | Index of the correct option (0=A, 1=B, 2=C, 3=D). Null in test split. |
| `explanation` | `string` | Expert explanation for the correct answer, where available. |
| `subject` | `string` | Medical subject area (e.g., Pharmacology, Anatomy). |
| `topic` | `string` | Specific topic within the subject. |
| `choice_type` | `string` | Whether the question is single-answer or multi-answer. |

---

## Usage

```python
from datasets import load_dataset

ds = load_dataset("Layered-Labs/benchbase-medmcqa")
print(ds)
```

### Example

```python
from datasets import load_dataset

ds = load_dataset("Layered-Labs/benchbase-medmcqa")

# Inspect the training split
train = ds["train"]
print(train[0])

# Filter to pharmacology questions
pharma = train.filter(lambda x: x["subject"] == "Pharmacology")
print(f"Pharmacology questions: {len(pharma)}")
```

---

## Citation

If you use this dataset in your research, please cite:

```bibtex
@dataset{layeredlabs_benchbase_medmcqa,
  title     = {BenchBase: MedMCQA},
  author    = {Ridwan, Abdullah and Hossain, Radhyyah},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/Layered-Labs/benchbase-medmcqa}
}
```

---

## License

Released under the [MIT License](LICENSE). Maintained by [Layered Labs](https://layeredlabs.ai).
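
As a supplement to the Features table: because `answer` stores an index (0 to 3) rather than a letter, and is null in the test split, evaluation code usually needs a small mapping step. The following is a minimal sketch, assuming only the column names documented in the table. The helper names (`format_prompt`, `answer_letter`, `accuracy`) and the sample record are hypothetical and are not part of the dataset or of any BenchBase tooling.

```python
# Sketch only: helper names and the sample row are illustrative assumptions.
LETTERS = "ABCD"
OPTION_COLUMNS = ("opa", "opb", "opc", "opd")


def format_prompt(row):
    """Render a row as a lettered multiple-choice prompt."""
    lines = [row["question"]]
    for letter, column in zip(LETTERS, OPTION_COLUMNS):
        lines.append(f"{letter}. {row[column]}")
    lines.append("Answer:")
    return "\n".join(lines)


def answer_letter(row):
    """Map the integer `answer` index to a letter; None when the label is withheld."""
    index = row["answer"]
    return None if index is None else LETTERS[index]


def accuracy(rows, predictions):
    """Fraction of labeled rows whose predicted letter matches the gold letter."""
    scored = [
        (answer_letter(row), predicted)
        for row, predicted in zip(rows, predictions)
        if row["answer"] is not None  # skip rows without a label (e.g. test split)
    ]
    if not scored:
        return None
    return sum(gold == predicted for gold, predicted in scored) / len(scored)


# Invented record that mimics the documented schema; it is not taken from the dataset.
sample = {
    "question": "Deficiency of which vitamin causes scurvy?",
    "opa": "Vitamin A",
    "opb": "Vitamin B12",
    "opc": "Vitamin C",
    "opd": "Vitamin D",
    "answer": 2,
}
unlabeled = {**sample, "answer": None}

print(format_prompt(sample))
print(answer_letter(sample))
print(accuracy([sample, unlabeled], ["C", "A"]))
```

The same functions should work on rows returned by `load_dataset`, since each row is exposed as a plain dictionary. Rows with a null `answer` are skipped rather than counted as wrong, which keeps an unlabeled split from silently dragging the score down.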
