TheLumos/WHB_subset
Hugging Face | Updated 2025-12-17 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/TheLumos/WHB_subset
Description:
A curated evaluation dataset for assessing large language models (LLMs) in women’s health–related tasks. The dataset consists of model stumps paired with expert-written justifications describing observed errors. The model stumps represent realistic prompts drawn from patient-facing scenarios, clinician decision-making contexts, and evidence-based medical questions. Each prompt is paired with an expert justification explaining the identified issue. This subset is derived from the Women’s Health Benchmark (WHB) and contains 20 representative model stumps with corresponding expert justifications. The dataset was developed in collaboration with women’s health experts, including clinicians, pharmacists, and researchers.
Provided by:
TheLumos



