PMC-Patients Meta Data

Source: Figshare. Updated 2023-11-06. Indexed 2026-04-08.
Download link:
https://figshare.com/articles/dataset/PMC-Patients_Meta_Data/24512725/1
Description:
## PMC-Patients Meta Data

Metadata for PMC-Patients that may facilitate reproduction or use of our dataset. It consists of the following files, most of which can be derived from our main files above.

### PMIDs.json

PMIDs of the articles from which PMC-Patients is extracted.
List of strings, length 140,897.

### train_PMIDs.json & dev_PMIDs.json & test_PMIDs.json & human_PMIDs.json

PMIDs of articles in the training / dev / test splits.
List of strings.

### train_patient_uids.json & dev_patient_uids.json & test_patient_uids.json & human_patient_uids.json

`patient_uid`s of notes in the training / dev / test splits.
List of strings.

### patient2article_relevance.json

Full patient-to-article dataset.
A dict whose keys are the `patient_uid` of each query and whose values are lists of `PMID`s of the articles relevant to that query.

The 3-point relevance can be obtained by checking whether the `PMID` is in `PMIDs.json`.

### patient2patient_similarity.json

Full patient-to-patient similarity dataset.
A dict whose keys are the `patient_uid` of each query and whose values are lists of `patient_uid`s of the patients similar to that query.

The 3-point similarity can be obtained by checking whether the similar patient shares its `PMID` (the string before '-' in `patient_uid`) with the query patient.

### PMID2Mesh.json

Dict mapping each PMID to the MeSH terms of the article.

### MeSH_Humans_patient_uids.json

`patient_uid`s of the patients in PMC-Patients-Humans (extracted from articles with the "Humans" MeSH term).
List of strings.

### PMC-Patients_citations.json

Citations for all articles we used to collect our dataset.
A dict whose keys are `patient_uid`s and whose values are the citations of the source articles.

### human_PMIDs.json

PMIDs of the 500 randomly sampled articles for human evaluation.
List of strings.

### PMC-Patients_human_eval.json

Expert annotation results for the 500 articles in `human_PMIDs.json`, including manually annotated patient notes, demographics, and relations of the top 5 retrieved articles / patients.
List of dicts whose keys are almost identical to those of `PMC-Patients.json`, with the exception of `human_patient_id` and `human_patient_uid`.

The relational annotations differ from the automatic ones. They are strings indicating on which dimension(s) the patient-article / patient-patient pair is relevant / similar.
"0", "1", "2", and "3" represent "Irrelevant", "Diagnosis", "Test", and "Treatment" in ReCDS-PAR, and "Dissimilar", "Features", "Outcomes", and "Exposure" in ReCDS-PPR.
Note that a pair can be relevant / similar on multiple dimensions at the same time.

### PAR_PMIDs.json

PMIDs of the 11.7M articles used as the PAR corpus.
List of strings.
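
The two 3-point rules and the human annotation codes described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code. The function names are hypothetical. The numeric grades are assumptions: 2 when the `PMID` is in `PMIDs.json` or the two patients share a source article, 1 otherwise, with 0 implied for anything not listed. The description only states which check to perform, not which grade each outcome receives. The decoder also assumes that each annotation string contains one digit per dimension (for example "13"); the exact string format is not specified in the description.

```python
import json

# Label maps taken from the description of PMC-Patients_human_eval.json.
PAR_LABELS = {"0": "Irrelevant", "1": "Diagnosis", "2": "Test", "3": "Treatment"}
PPR_LABELS = {"0": "Dissimilar", "1": "Features", "2": "Outcomes", "3": "Exposure"}


def load_json(path):
    """Load one of the metadata files, e.g. load_json("PMIDs.json")."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def source_pmid(patient_uid):
    """Return the PMID part of a patient_uid (the string before '-')."""
    return patient_uid.split("-", 1)[0]


def graded_relevance(relevant_pmids, dataset_pmids, in_dataset=2, other=1):
    """Grade each relevant article by membership in PMIDs.json.

    ASSUMPTION: the grade values (2 vs. 1) are illustrative defaults.
    """
    dataset_pmids = set(dataset_pmids)  # set gives O(1) membership checks
    return {
        pmid: (in_dataset if pmid in dataset_pmids else other)
        for pmid in relevant_pmids
    }


def graded_similarity(query_uid, similar_uids, same_article=2, other=1):
    """Grade each similar patient by whether it shares the query's PMID.

    ASSUMPTION: the grade values (2 vs. 1) are illustrative defaults.
    """
    query_pmid = source_pmid(query_uid)
    return {
        uid: (same_article if source_pmid(uid) == query_pmid else other)
        for uid in similar_uids
    }


def decode_dimensions(annotation, labels):
    """Map a human annotation string to dimension names.

    ASSUMPTION: every digit in the string is one dimension code;
    any other character (e.g. a separator) is ignored.
    """
    return [labels[ch] for ch in annotation if ch in labels]


# Toy in-memory data with made-up identifiers, standing in for the real files.
toy_similar = graded_similarity("100-1", ["100-2", "200-1"])
toy_relevant = graded_relevance(["100", "999"], ["100", "200"])
toy_dims = decode_dimensions("13", PAR_LABELS)
```

With the real files, `dataset_pmids` would come from `load_json("PMIDs.json")`, and the per-query lists from `patient2article_relevance.json` and `patient2patient_similarity.json`.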
Provided by:
Zhao, Zhengyun
Created:
2023-11-06