A Realistic Rwandan Community Health Worker Generated Vignette-based (i.e., Open-ended Questions) Benchmarking Dataset (with Associated Clinician and LLM Responses)

Indexed from the NIAID Data Ecosystem on 2026-05-10

Download link:

https://figshare.com/articles/dataset/A_Realistic_Rwandan_Communty_Health_Worker_Generated_Vingette-based_I_e_Open-ended_Questions_Benchmarking_Dataset_with_Associated_Clinician_and_LLM_Responses_/29213147

Description:

The full dataset of 5,609 clinical vignettes, contributed by 101 community health workers (CHWs) across four Rwandan districts, is available upon request from The Centre for the Fourth Industrial Revolution, the innovation lab for the Rwandan Government. Prospective users should contact info@c4ir.rw to request access. This subset of 524 questions, answers, and individual evaluation results comprises the data used in a benchmarking study comparing responses generated by five large language models (LLMs) (Gemini-2, GPT-4o, o3 mini, DeepSeek R1, and Meditron-70B) with those from local clinicians. The results were generated by applying a rubric of 11 expert-rated metrics, each scored on a five-point Likert scale. The associated benchmarking study, titled 'How Good Are Large Language Models at Supporting Frontline Healthcare Workers in Low-Resource Settings – A Benchmarking Study & Dataset,' describes in detail how these data were generated and how the evaluations were conducted.
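As a rough illustration of how rubric scores of this kind might be summarized, the Python sketch below averages five-point Likert ratings per responder and metric. The record layout (responder, metric, score), the function name `mean_scores`, and the sample rows are all assumptions made for illustration. They do not reflect the dataset's actual file format or contents, which are documented in the associated study.

```python
from collections import defaultdict
from statistics import mean


def mean_scores(rows):
    """Average Likert scores per (responder, metric) pair.

    `rows` is an iterable of (responder, metric, score) tuples. This
    record layout is hypothetical, not the dataset's real schema.
    """
    buckets = defaultdict(list)
    for responder, metric, score in rows:
        # The description states a five-point scale; 1-5 coding is assumed.
        if not 1 <= score <= 5:
            raise ValueError(f"score {score} is outside the 1-5 Likert range")
        buckets[(responder, metric)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


# Invented placeholder rows, not taken from the dataset.
example_rows = [
    ("model_a", "metric_1", 4),
    ("model_a", "metric_1", 5),
    ("clinician", "metric_1", 3),
]
summary = mean_scores(example_rows)
```

A plain mean is only one possible summary. Because Likert ratings are ordinal, medians or score distributions may be more appropriate depending on the analysis.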

Created:

2026-02-07