AI4PROFHEALTH - Profession-health status knowledge graph
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14203755
Description:
This dataset comprises a profession-clinical knowledge graph derived from the co-occurrence of normalised concepts identified in two distinct corpora: the Mesinesp2 corpus, a manually annotated corpus in which domain experts labelled scientific literature, clinical trial, and patent abstracts; and a corpus of clinical case reports. Applying different NER systems to each corpus enabled the extraction of clinical mentions of diseases, drugs, locations, procedures, species, species-human, and symptoms.
The repository contains one .zip file per corpus. The data in each archive follow this column order:
span_mention_1: Mention string (original): profession
normalized_entity_1: Controlled vocabulary entry for this term
code_mention_1: Identifier (code) of the term in the terminology used for normalization
mention_controlled_vocab: Terminology used for normalization
mention1_category: Semantic class (i.e., NER label)
mention1_freq: Absolute frequency of entity 1 (the profession mention)
span_mention_2: Mention string (original): entity 2 (disease, symptom, species, etc.)
normalized_entity_2: Controlled vocabulary entry for this term
code_mention_2: Identifier (code) of the term in the terminology used for normalization
mention_controlled_vocab: Terminology used for normalization
mention2_category: Semantic class (i.e., NER label)
mention2_freq: Absolute frequency of entity 2
co-occurrence: Number of co-occurrences of entity 1 and entity 2
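The column order above can be read into a simple adjacency structure. The following is a minimal sketch, not the authors' tooling. It assumes that the extracted file is tab-separated and that each row has exactly 13 fields in the listed order; neither assumption is confirmed by the description. Where the list repeats a column name, the sketch gives the second-entity column its own label. The helper names `read_edges` and `build_graph` and every value in `SAMPLE` are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Column order as listed in the description. Names repeated in that list are
# given distinct labels here (an assumption) so that dictionary keys are unique.
COLUMNS = [
    "span_mention_1", "normalized_entity_1", "code_mention_1",
    "mention_controlled_vocab_1", "mention1_category", "mention1_freq",
    "span_mention_2", "normalized_entity_2", "code_mention_2",
    "mention_controlled_vocab_2", "mention2_category", "mention2_freq",
    "co-occurrence",
]


def read_edges(handle, delimiter="\t"):
    """Yield one dict per data row, keyed by the column labels above."""
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(COLUMNS, row))
        try:
            record["co-occurrence"] = int(record["co-occurrence"])
        except ValueError:
            continue  # skip rows without a numeric count, e.g. a header row
        yield record


def build_graph(records):
    """Map each profession code to {entity code: summed co-occurrence count}."""
    graph = defaultdict(dict)
    for rec in records:
        src, dst = rec["code_mention_1"], rec["code_mention_2"]
        graph[src][dst] = graph[src].get(dst, 0) + rec["co-occurrence"]
    return dict(graph)


# Invented rows for illustration only; codes and vocabularies are placeholders.
SAMPLE = (
    "nurse\tnurse\tP001\tVOCAB_A\tPROFESSION\t12\t"
    "back pain\tback pain\tS001\tVOCAB_B\tSYMPTOM\t30\t4\n"
    "nurses\tnurse\tP001\tVOCAB_A\tPROFESSION\t12\t"
    "asthma\tasthma\tD001\tVOCAB_B\tDISEASE\t9\t2\n"
    "teacher\tteacher\tP002\tVOCAB_A\tPROFESSION\t7\t"
    "back pain\tback pain\tS001\tVOCAB_B\tSYMPTOM\t30\t1\n"
)

graph = build_graph(read_edges(io.StringIO(SAMPLE)))
```

Keying edges on the normalization codes rather than on the surface strings groups spelling variants of the same concept under one node, which is the reason the sketch sums counts when a code pair occurs more than once.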
Notes
This resource has been funded by the Spanish National Proyectos I+D+i 2020 AI4ProfHealth project PID2020-119266RA-I00 (PID2020-119266RA-I00/AEI/10.13039/501100011033).
Contact
If you have any questions or suggestions, please contact us at:
- Miguel Rodríguez Ortega
- Martin Krallinger
Additional resources and corpora
If you are interested, you might want to check out these corpora and resources:
MEDDOPROF (Corpus of mentions of professions, occupations and working status and normalization, different document collection with some overlapping documents)
MESINESP-2 (Corpus of manually indexed records with DeCS/MeSH terms comprising scientific literature abstracts, clinical trials, and patent abstracts, different document collection)
Created:
2024-12-02



