
Source code and data for the PhD Thesis "On-Premise Medical Information Extraction from German Doctor’s Letters under Clinical Constraints"

Source: DataCite Commons. Updated: 2026-04-21. Indexed: 2026-05-07.
Download link:
https://heidata.uni-heidelberg.de/citation?persistentId=doi:10.11588/DATA/USQLMB
Resource description:
<h2>Dataset overview</h2> <p> This dataset contains source code and annotation guidelines used in the PhD thesis: </p> <p> “On-Premise Medical Information Extraction from German Doctor’s Letters under Clinical Constraints” </p> <h3>Repository structure</h3> <p>The dataset is split into five repositories:</p> <ul> <li>Source code for Chapter 2.6 <em>De-identification of German doctor’s letters</em></li> <li>Source code for Chapter 5 <em>Clinical Section Classification using Pretrained Language Models and Prompting</em></li> <li>Source code for Chapter 6 <em>Medication Information Extraction using Local Large Language Models</em></li> <li>Source code for Chapter 7 <em>Clinical Application: Medication Trends and Polypharmacy</em></li> <li>Annotation guidelines for Chapters 2.6, 4, 5, and 7</li> </ul> <h3>CARDIO:DE</h3> <p> The main dataset used for experiments in Chapters 5, 6, and 7: </p> <ul> <li> CARDIO:DE - <a href="https://doi.org/10.11588/DATA/AFYQDY">https://doi.org/10.11588/DATA/AFYQDY</a> </li> </ul> <h3>Additional datasets (not included here)</h3> <p>Other datasets used include:</p> <ul> <li> n2c2 2018 Track 2 (used in Chapter 6) - <a href="https://doi.org/10.1093/jamia/ocz166">https://doi.org/10.1093/jamia/ocz166</a> </li> </ul> <h3>Notes on additional data and model availability</h3> <p> The cardiology doctor’s letters used in Chapters 2, 5, 6, and 7 (other than CARDIO:DE) and all further-pretrained and fine-tuned models cannot be distributed due to data protection regulations. </p>
Provider:
heiDATA
Created:
2026-04-14