MedProcNER/ProcTEMIST Corpus: Gold Standard annotations for Clinical Procedures Information Extraction
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7817745
Description:
MedProcNER stands for MEDical PROCedure Named Entity Recognition. It is a shared task and set of resources focused on the detection, normalization and indexing of clinical procedures in medical documents in Spanish. MedProcNER is complementary to the DisTEMIST corpus (https://temu.bsc.es/distemist) as they both use the same document collection, which is why it's also called ProcTEMIST.
This repository contains the Train Set of the task, 750 documents in total. The unannotated test text files are also included so that predictions can be generated for them. Finally, we include a gazetteer of candidate SNOMED CT codes for the normalization and indexing tasks. For more information, please check the attached README file.
** UPDATE MAY 2nd 2023: Second part of the train set, test set texts and gazetteer now available!
** UPDATE MAY 12th 2023: We've uploaded a new version of the gazetteer that removes some ambiguous codes mistakenly added from older SNOMED versions.
MedProcNER was developed by the Barcelona Supercomputing Center's NLP for Biomedical Information Analysis group and was used as part of BioASQ @ CLEF 2023. For more information on the corpus, the annotation scheme and the task in general, please visit: https://temu.bsc.es/medprocner.
Related Links:
- MedProcNER website: https://temu.bsc.es/medprocner
- MedProcNER Guidelines: https://doi.org/10.5281/zenodo.7817666
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Contact
If you have any questions or suggestions, please contact us at:
- Salvador Lima-López ()
- Martin Krallinger ()
Created:
2023-08-08



