A brief explanation of the technical terms.
Source: Figshare | Updated: 2026-03-16 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/_p_a_brief_explanation_of_the_technical_terms_p_/31756726
Resource description:
Background: Routine healthcare data are increasingly stored in electronic health records (EHRs), presenting an opportunity to leverage machine learning (ML) for detecting and predicting medical events. While medical experts are optimistic about expanding its applications, several caveats are often overlooked. Many medical outcomes are categorical (e.g., a diagnosis is present or absent), with categories that are considerably unequal in size, which might significantly impact the performance of ML algorithms. Detecting small subgroups in EHR data, so-called anomaly detection, is an emerging approach, yet organized documentation on current practices remains scarce. This scoping review examines medical anomaly detection based on routine healthcare data stored in EHRs and formulates alternative approaches where suboptimal practices were noticed.

Methods: PubMed and Web of Science were searched up to September 5, 2024. Peer-reviewed articles and conference papers on ML-based medical anomaly detection in EHR data were included. Fifty-two study characteristics were extracted and analyzed both quantitatively and qualitatively.

Results: A total of 117 studies met the inclusion criteria. The cross-study median proportion of the anomalous class was 0.079 (range 0.00045–0.23). Key details, e.g., data preprocessing actions, were often incomplete; 14.5% (n = 17) provided no information on this aspect. Only four studies reported the underlying cause of missingness before deciding how to handle it, and just three considered the clinical implications of false positives and false negatives when evaluating anomaly detection performance.

Conclusion: We identified a need for greater attention in the current medical anomaly detection literature to reporting details on preprocessing, handling of missing data, and the use of performance metrics. With the increasing number of anomaly detection studies based on routine healthcare data stored in EHRs, more focus is needed on implementation and reporting practices to ensure the relevance and reproducibility of future studies in this field.
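
The point about unequal class sizes and the choice of performance metrics can be illustrated with a small arithmetic sketch. It is not taken from the review: the cohort size of 1,000 records, the helper names, and the "always predict normal" baseline are assumptions made for illustration. Only the anomalous-class proportion of 0.079 comes from the results above.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return accuracy, recall and precision; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision}


# Hypothetical cohort: 1,000 records, 79 anomalies (proportion 0.079, the
# cross-study median reported above). The cohort size is an assumption.
N_RECORDS = 1000
N_ANOMALIES = 79
y_true = [1] * N_ANOMALIES + [0] * (N_RECORDS - N_ANOMALIES)

# Trivial baseline that labels every record as normal.
y_pred_trivial = [0] * N_RECORDS

baseline = metrics(y_true, y_pred_trivial)
print(baseline)
```

Under these assumptions the trivial baseline reaches an accuracy of 0.921 while detecting none of the anomalies (recall 0.0). This is one reason accuracy alone can be misleading for imbalanced outcomes, and why the clinical cost of false negatives and false positives deserves explicit consideration.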
Created:
2026-03-16



