nsk7153/MedCalc-Bench

Name: nsk7153/MedCalc-Bench
Creator: nsk7153
Published: 2024-06-14 22:47:58
License: CC-BY-SA 4.0

Source: Hugging Face. Updated 2024-06-14. Indexed 2024-06-22.

Download link:

https://hf-mirror.com/datasets/nsk7153/MedCalc-Bench

Resource description:

---
license: cc-by-sa-4.0
dataset_info:
  features:
  - name: Row Number
    dtype: int64
  - name: Calculator ID
    dtype: int64
  - name: Calculator Name
    dtype: string
  - name: Category
    dtype: string
  - name: Output Type
    dtype: string
  - name: Note ID
    dtype: string
  - name: Note Type
    dtype: string
  - name: Patient Note
    dtype: string
  - name: Question
    dtype: string
  - name: Relevant Entities
    dtype: string
  - name: Ground Truth Answer
    dtype: string
  - name: Lower Limit
    dtype: string
  - name: Upper Limit
    dtype: string
  - name: Ground Truth Explanation
    dtype: string
  splits:
  - name: train
    num_bytes: 41265307
    num_examples: 10053
  - name: test
    num_bytes: 4038342
    num_examples: 1047
  download_size: 19626866
  dataset_size: 45303649
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---

MedCalc-Bench is the first medical calculation dataset used to benchmark LLMs' ability to serve as clinical calculators. Each instance in the dataset consists of a patient note, a question asking to compute a specific clinical value, a final answer value, and a step-by-step solution explaining how the final answer was obtained. Our dataset covers 55 different calculation tasks. We hope this dataset serves as a call to improve the verbal and computational reasoning skills of LLMs in medical settings.

This dataset contains a training set of 10,053 instances (about 10.1k) and a test set of 1047 instances.

## Contents inside the Training and Testing CSV

The test set contains 1047 instances in total. Each row in the dataset contains the following information:

- Calculator Name
- Note ID
- Note Type: Extracted (if the note is from Open-Patients), Synthetic (if the note is handwritten by a clinician), or Template (if the note is generated using a template)
- Relevant Entities needed for the calculation
- Ground Truth Answer
- Lower Limit
- Upper Limit
- Ground Truth Explanation

Note that for equation-based calculators whose output is a decimal, the values for the Upper Limit and Lower Limit will be +/- 0.05% of the ground truth answer. If the LLM's final answer value is between the lower and upper limit, the answer is considered correct. For all other instances, the Upper Limit and Lower Limit are set to the same value as the ground truth.

We allow this margin of error for equation-based calculators to accommodate rounding differences in intermediate steps. The issue does not arise for date-based equation calculators and rule-based calculators, where the final value is independent of any rounding done in the intermediate steps.

## How to Use MedCalc-Bench

The training set of MedCalc-Bench can be used for fine-tuning LLMs. We have provided both the fine-tuned models and the code for fine-tuning at our repository: https://github.com/ncbi-nlp/MedCalc-Bench.

The test set of MedCalc-Bench is helpful for benchmarking LLMs under different settings. The README of our repository gives instructions for reproducing all of our results for all of the models using the different prompt settings.

By experimenting with different LLMs and prompts, we hope our dataset demonstrates the potential and limitations of LLMs in different settings.

## License

Both the training and testing datasets of MedCalc-Bench are released under the CC-BY-SA 4.0 license.

Provider:

nsk7153

Summary of original information

Dataset overview

Dataset information

Features:
- Row Number: int64
- Calculator ID: int64
- Calculator Name: string
- Category: string
- Output Type: string
- Note ID: string
- Note Type: string
- Patient Note: string
- Question: string
- Relevant Entities: string
- Ground Truth Answer: string
- Lower Limit: string
- Upper Limit: string
- Ground Truth Explanation: string
Splits:
- train: 10053 examples, 41265307 bytes
- test: 1047 examples, 4038342 bytes
Dataset size:
- Download size: 19626866 bytes
- Total dataset size: 45303649 bytes
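
As a quick sanity check, the per-split figures can be summed and compared with the reported totals. The snippet below is an illustrative sketch only: the numbers are copied from the metadata listed above, while the dictionary layout and variable names are assumptions made for the example.

```python
# Figures copied from the dataset metadata; the dict layout is illustrative.
splits = {
    "train": {"num_bytes": 41265307, "num_examples": 10053},
    "test": {"num_bytes": 4038342, "num_examples": 1047},
}
reported_dataset_size = 45303649
reported_download_size = 19626866

# Sum the per-split values and compare them with the reported totals.
total_bytes = sum(s["num_bytes"] for s in splits.values())
total_examples = sum(s["num_examples"] for s in splits.values())

# Ratio of the download size to the total dataset size.
download_ratio = reported_download_size / reported_dataset_size

print(total_bytes == reported_dataset_size)  # True
print(total_examples)                        # 11100
print(round(download_ratio, 2))              # 0.43
```

The split byte counts add up exactly to the reported dataset size, and the download is a little under half of that size.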

Dataset configuration

Default configuration:
- train data file path: data/train-*
- test data file path: data/test-*
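
The two paths above are glob patterns rather than single file names. A minimal sketch of how such patterns map files to splits is shown below, using Python's standard `fnmatch` module. The helper `split_for` and the example file names are hypothetical and serve only to exercise the patterns; the actual file names in the repository are not stated here.

```python
from fnmatch import fnmatch

# Glob patterns copied from the default configuration.
DATA_FILES = {"train": "data/train-*", "test": "data/test-*"}


def split_for(path):
    """Return the split whose pattern matches `path`, or None if none match."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


# Hypothetical file names, used only for illustration.
print(split_for("data/train-00000-of-00001.parquet"))  # train
print(split_for("data/test-00000-of-00001.parquet"))   # test
print(split_for("README.md"))                          # None
```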

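Dataset contents

Training and test sets:
- The training set contains 10,053 instances (about 10.1k)
- The test set contains 1047 instances
- Each instance contains the following information:
  - Calculator Name
  - Note ID
  - Note Type (one of Extracted, Synthetic, or Template)
  - Relevant Entities
  - Ground Truth Answer
  - Lower Limit
  - Upper Limit
  - Ground Truth Explanation
Special notes:
- For equation-based calculators whose output is a decimal, the lower and upper limits are +/- 0.05% of the ground truth answer
- For all other instances, the lower and upper limits equal the ground truth answer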
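
The correctness rule can be sketched as a small Python helper that reads the limits straight from the `Lower Limit` and `Upper Limit` columns, so it does not depend on the tolerance value. This is a sketch under assumptions, not the authors' evaluation code: the function names are hypothetical, the exact-match fallback for non-numeric answers (such as dates) is an assumption, and the sample values are invented.

```python
def parse_number(text):
    """Return float(text), or None when the value is not a plain number."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def is_correct(prediction, lower_limit, upper_limit):
    """Check a prediction against the Lower Limit / Upper Limit columns.

    The limits are stored as strings. When any value is non-numeric, this
    sketch falls back to an exact string match (an assumption).
    """
    pred = parse_number(prediction)
    low = parse_number(lower_limit)
    high = parse_number(upper_limit)
    if pred is None or low is None or high is None:
        return str(prediction).strip() == str(lower_limit).strip()
    # Defensive ordering in case the two limits are stored swapped.
    low, high = min(low, high), max(low, high)
    return low <= pred <= high


# Invented values, for illustration only.
print(is_correct("23.6", "22.4", "24.8"))                    # True
print(is_correct("25", "22.4", "24.8"))                      # False
print(is_correct("3", "3", "3"))                             # True
print(is_correct("08/31/2023", "08/31/2023", "08/31/2023"))  # True
```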

Dataset uses

Training set: for fine-tuning large language models (LLMs)
Test set: for benchmarking large language models (LLMs) under different settings
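
When benchmarking, it can help to report accuracy per `Category` as well as overall. The sketch below shows one way to aggregate results. It assumes numeric limits and already-parsed numeric predictions; the function name, category labels, and sample values are all hypothetical, and this is not the scoring code from the authors' repository.

```python
from collections import defaultdict


def accuracy_by_category(rows, predictions):
    """Return {category: accuracy} for predictions aligned with rows.

    rows: dicts with 'Category', 'Lower Limit', 'Upper Limit' (numeric strings).
    predictions: floats, one per row.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row, pred in zip(rows, predictions):
        low, high = float(row["Lower Limit"]), float(row["Upper Limit"])
        totals[row["Category"]] += 1
        if low <= pred <= high:
            hits[row["Category"]] += 1
    return {cat: hits[cat] / totals[cat] for cat in totals}


# Hypothetical rows and predictions, for illustration only.
rows = [
    {"Category": "lab", "Lower Limit": "9.5", "Upper Limit": "10.5"},
    {"Category": "lab", "Lower Limit": "1.9", "Upper Limit": "2.1"},
    {"Category": "risk", "Lower Limit": "3", "Upper Limit": "3"},
]
predictions = [10.0, 2.5, 3.0]

print(accuracy_by_category(rows, predictions))  # {'lab': 0.5, 'risk': 1.0}
```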

License

The dataset is released under the CC-BY-SA 4.0 license.
