
Natural language processing systems for pathology parsing in limited data environments with uncertainty estimation

Source: DataONE | Updated: 2020-07-27 | Indexed: 2025-06-21
Download link:
https://search.dataone.org/view/sha256:de9244c85ba6dd3c7d240cdaf1c7b758a76c4c3c4ff5a7bb22a7a988463be7f0
Description:
Objective: Cancer is a leading cause of death, but much of the diagnostic information is stored as unstructured data in pathology reports. We aim to improve the uncertainty estimates of machine-learning-based pathology parsers and to evaluate their performance in low-data settings.

Materials and Methods: Our data come from the Urologic Outcomes Database at UCSF, which includes 3,232 annotated prostate cancer pathology reports from 2001 to 2018. We approach 17 separate information extraction tasks involving a wide range of pathologic features. To handle this diverse range of fields, we required two statistical models: a document classification method for pathologic features with a small set of possible values, and a token extraction method for pathologic features with a large set of values. For each model, we used isotonic calibration to improve the model’s estimates of its likelihood of being correct.

Results: Our best document classifier method, a convolutional neural network, achieves a weighted ...
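The isotonic calibration step mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation, which the listing does not show. It assumes a held-out set of raw model confidence scores paired with 0/1 correctness labels, and fits a non-decreasing step function with the pool-adjacent-violators algorithm. The function names `fit_isotonic` and `calibrate` and the toy values are hypothetical and are not taken from the dataset.

```python
from bisect import bisect_left
from itertools import groupby


def fit_isotonic(scores, correct):
    """Fit a non-decreasing step function from raw score to P(correct).

    Uses the pool-adjacent-violators algorithm. Returns (uppers, values):
    block i covers scores up to uppers[i] and predicts values[i].
    """
    pairs = sorted(zip(scores, correct))
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for score, group in groupby(pairs, key=lambda p: p[0]):
        labels = [y for _, y in group]  # tied scores are pooled up front
        blocks.append([float(sum(labels)), len(labels), score])
        # Merge backwards while an earlier block has a higher mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values


def calibrate(model, score):
    """Map a raw score to a calibrated probability (step-function lookup)."""
    uppers, values = model
    idx = min(bisect_left(uppers, score), len(values) - 1)
    return values[idx]


# Invented toy data: raw confidences and whether each prediction was correct.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
was_correct = [0, 0, 1, 0, 1, 1, 0, 1, 1]

model = fit_isotonic(raw_scores, was_correct)
print(calibrate(model, 0.35))
```

The step-function lookup is a simplification; library implementations such as scikit-learn's `IsotonicRegression` interpolate linearly between fitted points instead.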
Created:
2025-06-17