Modeling data set and feature set.
Source: Figshare. Updated: 2024-04-17. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Modeling_data_set_and_feature_set_/25630420
Description:
Purpose: To explore the feasibility and validity of machine learning models in determining causality in medical malpractice cases, and to improve the scientific rigor and reliability of identification opinions.
Methods: We collected 13,245 written judgments from PKULAW.COM, a public database; 963 cases were included after initial screening. Twenty-one medical factors and ten patient factors were selected as feature variables by summarising previous literature and cases. Random Forest, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) were each used to build causality prediction models on the two data sets (real and virtual). The optimal model was then obtained by hyperparameter tuning of the six models.
Results: The three algorithms yielded three real data set models and three virtual data set models, whose confusion matrices differed. XGBoost performed best on the real data set, with an accuracy of 66%. On the virtual data set, XGBoost and LightGBM performed essentially the same, with an accuracy of 80%. The overall accuracy of external validation was 72.7%.
Conclusions: The optimal model of this study is expected to predict causality accurately.
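The description compares models by their confusion matrices and accuracy. As a rough illustration of how those two quantities relate, here is a minimal sketch in plain Python. The class labels `A`, `B`, `C` and the toy predictions are invented for this example and do not come from the data set; the helper names `confusion_matrix` and `accuracy` are likewise hypothetical, not the authors' code.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix where rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Accuracy is the diagonal (correct predictions) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical causality classes and toy labels, invented for illustration only.
labels = ["A", "B", "C"]
y_true = ["A", "B", "C", "B", "A", "C", "B", "A"]
y_pred = ["A", "B", "B", "B", "C", "C", "A", "A"]

matrix = confusion_matrix(y_true, y_pred, labels)
acc = accuracy(matrix)

for label, row in zip(labels, matrix):
    print(label, row)
print(f"accuracy = {acc:.3f}")
```

Two models can reach the same accuracy with different confusion matrices, which is why the description reports both.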
Created:
2024-04-17



