
AI4PROFHEALTH - Profession-health status knowledge graph

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14203755
Description:
This dataset comprises a profession-clinical knowledge graph derived from the co-occurrence of normalised concepts identified in two distinct corpora: the Mesinesp2 corpus, a manually annotated corpus in which domain experts have labelled scientific literature, clinical trials, and patent abstracts; and a collection of clinical case reports. Applying different NER systems to each corpus enabled the extraction of clinical mentions related to diseases, drugs, locations, procedures, species, species-human, and symptoms.

The repository contains one .zip file per corpus. The data in each follows this column order:
- span_mention_1: Mention string (original) for entity 1, the profession
- normalized_entity_1: Controlled vocabulary entry for this term
- code_mention_1: Terminology ID used for normalization
- mention_controlled_vocab: Terminology used for normalization
- mention1_category: Semantic class (i.e., NER label)
- mention1_freq: Absolute frequency of this mention (entity 1)
- span_mention_2: Mention string (original) for entity 2 (disease, symptom, species, etc.)
- normalized_entity_2: Controlled vocabulary entry for this term
- code_mention_2: Terminology ID used for normalization
- mention_controlled_vocab: Terminology used for normalization
- mention2_category: Semantic class (i.e., NER label)
- mention1_freq: Absolute frequency of this mention (entity 2)
- co-occurrence: Number of co-occurrences

Notes
This resource has been funded by the Spanish National Proyectos I+D+i 2020 AI4ProfHealth project PID2020-119266RA-I00 (PID2020-119266RA-I0/AEI/10.13039/501100011033).
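
As a rough illustration of how the column order above could be turned into a weighted graph, here is a minimal Python sketch. Several things in it are assumptions rather than facts about the dataset: that the extracted files are delimiter-separated text (tab is used as a default guess), that a header row may or may not be present, and that edges are best keyed by the terminology codes. The description lists `mention_controlled_vocab` and `mention1_freq` twice, so the sketch appends a hypothetical `_2` suffix to the second occurrence purely to keep dictionary keys unique. The sample rows are made-up placeholders, not values from the dataset.

```python
import csv
import io
from collections import defaultdict

# Column order as listed in the description. The "_2" suffixes are an
# assumption added here to disambiguate names that the description repeats;
# they are not column names taken from the dataset itself.
COLUMNS = [
    "span_mention_1", "normalized_entity_1", "code_mention_1",
    "mention_controlled_vocab", "mention1_category", "mention1_freq",
    "span_mention_2", "normalized_entity_2", "code_mention_2",
    "mention_controlled_vocab_2", "mention2_category", "mention1_freq_2",
    "co-occurrence",
]


def read_edges(handle, delimiter="\t"):
    """Yield one dict per data row, mapping columns by position.

    The delimiter is an assumption; adjust it to the actual file format.
    """
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        if row[0] == COLUMNS[0]:
            continue  # skip a header row, if the file has one
        yield dict(zip(COLUMNS, row))


def build_graph(rows):
    """Sum co-occurrence counts per (profession code, entity code) pair."""
    graph = defaultdict(dict)
    for r in rows:
        src, dst = r["code_mention_1"], r["code_mention_2"]
        graph[src][dst] = graph[src].get(dst, 0) + int(r["co-occurrence"])
    return graph


# Placeholder rows for illustration only (not real dataset content).
SAMPLE = (
    "span_a\tPROF_A\tP1\tVOCAB_X\tPROFESSION\t3\t"
    "span_b\tENT_B\tD1\tVOCAB_Y\tDISEASE\t5\t2\n"
    "span_c\tPROF_A\tP1\tVOCAB_X\tPROFESSION\t1\t"
    "span_d\tENT_B\tD1\tVOCAB_Y\tDISEASE\t2\t1\n"
    "span_a\tPROF_A\tP1\tVOCAB_X\tPROFESSION\t3\t"
    "span_e\tENT_C\tS1\tVOCAB_Y\tSYMPTOM\t7\t4\n"
)

graph = build_graph(read_edges(io.StringIO(SAMPLE)))
```

Keying edges by code rather than by surface mention merges different spellings of the same normalised concept into a single edge. Whether that aggregation is appropriate depends on how the rows in the released files are already grouped, which the description does not state.
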
Contact
If you have any questions or suggestions, please contact us at:
- Miguel Rodríguez Ortega
- Martin Krallinger

Additional resources and corpora
If you are interested, you might want to check out these corpora and resources:
- MEDDOPROF (Corpus of mentions of professions, occupations and working status and normalization; different document collection with some overlapping documents)
- MESINESP-2 (Corpus of manually indexed records with DeCS/MeSH terms comprising scientific literature abstracts, clinical trials, and patent abstracts; different document collection)
Created:
2024-12-02