nsk7153/MedCalc-Bench-Verified

Source: Hugging Face. Updated 2025-12-19. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/nsk7153/MedCalc-Bench-Verified
Description:
MedCalc-Bench Verified is a re-verified version of MedCalc-Bench, used to benchmark LLMs' ability to serve as clinical calculators. Each instance consists of a patient note, a question asking for a specific clinical value, the final answer value, and a step-by-step solution explaining how that answer was obtained. The dataset covers 55 calculation tasks, each of which is either rule-based or equation-based. It contains 10,538 training instances and 1,100 test instances. We hope that the dataset and benchmark serve as a call to improve the computational reasoning skills of LLMs in medical settings.
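
To make the instance layout and the two task families concrete, here is a minimal Python sketch. It is illustrative only: the `Instance` field names are hypothetical and may not match the dataset's actual column names, BMI and CURB-65 are used as familiar examples of an equation-based and a rule-based calculation without any claim that they appear among the 55 tasks, and the 5% relative tolerance in `is_correct` is an assumed grading rule rather than the benchmark's documented one.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical record layout; the real column names may differ."""
    patient_note: str
    question: str
    final_answer: float
    explanation: str


def bmi(weight_kg: float, height_m: float) -> float:
    """Equation-based example: body mass index = weight / height^2."""
    return weight_kg / height_m ** 2


def curb65(confusion: bool, bun_mg_dl: float, resp_rate: int,
           systolic: int, diastolic: int, age: int) -> int:
    """Rule-based example: one point for each criterion that is met."""
    return sum([
        confusion,
        bun_mg_dl > 19,
        resp_rate >= 30,
        systolic < 90 or diastolic <= 60,
        age >= 65,
    ])


def is_correct(predicted: float, truth: float, rel_tol: float = 0.05) -> bool:
    """Assumed grading rule: accept answers within a relative tolerance."""
    return abs(predicted - truth) <= rel_tol * abs(truth)


# A made-up instance, not taken from the dataset.
example = Instance(
    patient_note="A 45-year-old patient weighs 72 kg and is 1.80 m tall.",
    question="What is the patient's body mass index in kg/m^2?",
    final_answer=bmi(72, 1.80),
    explanation="BMI = 72 / (1.80 * 1.80)",
)
```

Under these assumptions, a model answer of 22.2 for `example` would be accepted by `is_correct(22.2, example.final_answer)`, whereas 25.0 would be rejected.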
Provided by:
nsk7153