MIMIC-Ext-DrugDetection
Source: DataCite Commons. Updated 2025-09-25. Indexed 2026-05-04.
Download link:
https://physionet.org/content/mimic-ext-drug-detection/1.0.0/
Resource description:
This project shares a large, annotated drug detection dataset created from
MIMIC-III/IV discharge summaries. The dataset was developed to address the
challenge of identifying substance use behaviors in Electronic Health Records
(EHRs), where critical details are often embedded in unstructured notes
requiring contextual interpretation. The primary aim was to support future
systemic substance use surveillance. The data consists of medical notes
tokenized into sentences, annotated for eight substance use categories:
heroin, cocaine, methamphetamine, illicit use of prescription opioids and
benzodiazepines, cannabis, Injection Drug Use (IDU), and general drug use. The
dataset was used to evaluate the performance of various large language models
(LLMs) for detecting these substance use categories, demonstrating that LLMs,
particularly a fine-tuned model, can significantly enhance detection accuracy
and show promise for clinical decision support and research.
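
To make the annotation scheme concrete, here is a minimal Python sketch of how sentence-level, multi-label annotations of this kind could be represented and scored per category. It is an illustration under stated assumptions, not the dataset's actual file format or the authors' evaluation code: the `AnnotatedSentence` class, the category identifiers, the `per_category_f1` helper, and the toy sentences are all hypothetical. Treating prescription opioids and benzodiazepines as separate categories is an assumed reading of the list above that yields eight labels.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the eight categories named in the description.
# The real dataset may use different names and a different encoding.
CATEGORIES = (
    "heroin",
    "cocaine",
    "methamphetamine",
    "prescription_opioids",
    "benzodiazepines",
    "cannabis",
    "idu",
    "general_drug_use",
)


@dataclass(frozen=True)
class AnnotatedSentence:
    """One sentence from a note plus the set of categories it is labeled with."""
    text: str
    labels: frozenset = frozenset()


def per_category_f1(gold, predicted):
    """Return {category: (precision, recall, f1)} for parallel lists of label sets."""
    scores = {}
    for cat in CATEGORIES:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if cat in g and cat in p)
        fp = sum(1 for g, p in pairs if cat not in g and cat in p)
        fn = sum(1 for g, p in pairs if cat in g and cat not in p)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[cat] = (precision, recall, f1)
    return scores


# Toy sentences written for this sketch; they are NOT taken from the dataset.
sentences = [
    AnnotatedSentence("Reports injecting heroin daily.", frozenset({"heroin", "idu"})),
    AnnotatedSentence("Urine screen positive for cocaine.", frozenset({"cocaine"})),
    AnnotatedSentence("Denies any drug use.", frozenset()),
]
gold = [s.labels for s in sentences]

# Hypothetical model output for the same three sentences.
predicted = [frozenset({"heroin"}), frozenset({"cocaine", "cannabis"}), frozenset()]

scores = per_category_f1(gold, predicted)
for cat in CATEGORIES:
    p, r, f = scores[cat]
    print(f"{cat:22s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Scoring each category separately, rather than requiring an exact match on the whole label set, keeps a miss on one substance from masking correct detections of the others.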
Provider:
PhysioNet
Created:
2025-08-29



