ClinBench-HPB
Source: arXiv. Indexed 2025-09-30.
Download link:
https://clinbench-hpb.github.io
Resource description:
ClinBench-HPB is a clinical evaluation benchmark for assessing large language models on hepatobiliary and pancreatic (HPB) diseases. It comprises 3,535 closed-ended multiple-choice questions and 337 open-ended real-world diagnostic cases, covering all 33 major categories and 465 subcategories of HPB diseases as defined by ICD-10. The benchmark evaluates large language models on two tasks: medical question answering and clinical case diagnosis.



