Code for the HIVE Appendicitis prediction model: Repository with LLM_data_extractor_optuna for automated feature extraction

Source: NIAID Data Ecosystem (indexed 2026-05-02)

Download link:

https://figshare.com/articles/dataset/Code_for_developing_and_validating_HIVE_model_for_Appendicitis_Prediction_XGBoost_/28931030

Resource description:

This repository contains the code and trained model for the HIVE (History, Intake, Vitals, Examination) machine learning model, which predicts appendicitis in patients presenting with acute abdominal pain at the Emergency Department (ED). The HIVE model was developed using structured data derived from ED intake forms, vital signs, and clinical signs and symptoms extracted from free-text ED reports. Depending on the experiment, these clinical features were either manually annotated or automatically extracted using a large language model (LLM).

The codebase includes:
- Preprocessing scripts for merging structured and unstructured inputs
- Feature engineering and selection steps
- Model development using XGBoost
- Hyperparameter tuning via Optuna
- Evaluation procedures, including AUROC calculation and bootstrapping for confidence intervals

The repository also includes a pickled version of the final trained model (developed on 268 training cases), which can be used to generate appendicitis risk predictions on new patient data when provided with the appropriate input features.

The LLM Data Extractor optuna repo is a Python framework for generating and evaluating clinical text predictions using large language models (LLMs) such as qwen2.5. It supports:
- Prediction task execution via a local Ollama server
- Hyperparameter tuning with Optuna across temperature, top-k, top-p, and min-p
- Extraction of structured outputs from unstructured text data (e.g., ED reports)
- Full configuration via CLI for automated or fine-tuned runs

Please read the attached readme.md for a brief walkthrough of how to use the LLM JSON pipeline (with examples). The pipeline supports both direct prediction generation and structured evaluation with minimal setup. It is based on llm_extractinator (DIAGNijmegen/llm_extractinator on GitHub).
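
As a rough illustration of extraction through a local Ollama server with the sampling parameters named above, the sketch below builds a request for Ollama's /api/generate endpoint using only the Python standard library. It is not the repository's code: the prompt text, the two feature names, the default parameter values, and the function names build_generate_request and extract_features are all hypothetical, and the URL assumes Ollama's default local port.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed, not from the repo).
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical prompt and feature names, for illustration only.
PROMPT_TEMPLATE = (
    "Extract the following signs from the emergency department report and "
    "answer with a JSON object containing the boolean fields "
    '"migration_of_pain" and "rebound_tenderness".\n\nReport:\n{report}'
)


def build_generate_request(report, model="qwen2.5", temperature=0.0,
                           top_k=40, top_p=0.9, min_p=0.0):
    """Build (but do not send) a POST request for Ollama's generate endpoint."""
    payload = {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(report=report),
        "stream": False,   # ask for one complete JSON response
        "format": "json",  # ask the model to emit valid JSON
        "options": {       # the four sampling parameters a tuner could vary
            "temperature": temperature,
            "top_k": top_k,
            "top_p": top_p,
            "min_p": min_p,
        },
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_features(report, **sampling):
    """Send the request to a running Ollama server and parse the model's JSON."""
    request = build_generate_request(report, **sampling)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With stream=False, the generated text is returned in the "response" field.
    return json.loads(body["response"])
```

With a server running and the model pulled, a call such as extract_features(report_text, temperature=0.2, top_k=20) would return a dictionary of extracted fields. A tuner such as Optuna could then vary the four sampling options and score each configuration against manually annotated labels.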
This repository is linked to the paper Large Language Model Automated Extraction of Clinical Signs and Symptoms From Emergency Department Reports for Machine Learning Prediction Models: A Development and Validation Study.
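
The evaluation step mentioned in the description (AUROC with bootstrapped confidence intervals) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: it uses a percentile bootstrap with a fixed seed, and the function names auroc and bootstrap_auroc_ci, the number of resamples, and the handling of single-class resamples are choices made here for the example.

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap."""
    point = auroc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper
```

Here labels would be the 0/1 appendicitis outcomes and scores the predicted risks from the model. Resampling whole cases keeps each label paired with its score, which is what makes the interval reflect sampling variability in the test set.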

Created:

2025-05-05
