nsk7153/MedCalc-Bench-Verified

Name: nsk7153/MedCalc-Bench-Verified
Creator: nsk7153
Published: 2025-12-19 04:03:51
License: not specified

Source: Hugging Face (updated 2025-12-19, indexed 2025-12-20)

Download link:

https://hf-mirror.com/datasets/nsk7153/MedCalc-Bench-Verified

Description:

MedCalc-Bench Verified is a re-verified version of MedCalc-Bench, a benchmark of LLMs' ability to serve as clinical calculators. Each instance consists of a patient note, a question asking for a specific clinical value, a final answer value, and a step-by-step solution explaining how the final answer was obtained. The dataset covers 55 calculation tasks, each either rule-based or equation-based. It contains a training set of 10,538 instances and a test set of 1,100 instances. We hope that the dataset and benchmark serve as a call to improve the computational reasoning skills of LLMs in medical settings.
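
To make the instance structure concrete, here is a minimal Python sketch of one equation-based item. Everything in it is an assumption for illustration: the `Instance` class and its field names are hypothetical and are not the dataset's actual column names, the patient note is made up, body mass index is used only as a familiar formula, and the 5% relative tolerance is an arbitrary choice, not the benchmark's scoring rule.

```python
from dataclasses import dataclass
import math


@dataclass
class Instance:
    """Hypothetical container mirroring the four parts the description lists."""
    patient_note: str
    question: str
    final_answer: float
    solution: str


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def is_correct(predicted: float, reference: float, rel_tol: float = 0.05) -> bool:
    """Accept a prediction within a relative tolerance (the 5% default is an assumption)."""
    return math.isclose(predicted, reference, rel_tol=rel_tol)


# A made-up example item, not taken from the dataset.
example = Instance(
    patient_note="A 45-year-old patient weighs 70 kg and is 1.75 m tall.",
    question="What is the patient's body mass index in kg/m^2?",
    final_answer=round(bmi(70.0, 1.75), 2),
    solution="BMI = weight / height^2 = 70 / (1.75 * 1.75)",
)
```

A model's numeric output could then be checked with `is_correct(model_value, example.final_answer)`. A rule-based task would replace the formula with a point-scoring function over findings in the note.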

Provider:

nsk7153
