
PMC-Patients Meta Data

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/PMC-Patients_Meta_Data/24512725
Description:
## PMC-Patients Meta Data

Metadata for PMC-Patients that may help with reproducing our results or using the dataset. It consists of the following files, most of which can be derived from our main files above.

### PMIDs.json

PMIDs of the articles from which PMC-Patients are extracted. List of string, length 140,897.

### train_PMIDs.json & dev_PMIDs.json & test_PMIDs.json & human_PMIDs.json

PMIDs of the articles in the training / dev / test splits. List of string.

### train_patient_uids.json & dev_patient_uids.json & test_patient_uids.json & human_patient_uids.json

Patient_uids of the notes in the training / dev / test splits. List of string.

### patient2article_relevance.json

Full patient-to-article dataset. A dict where the keys are the `patient_uid` of queries and each entry is a list of `PMID`, representing the articles relevant to the query. The 3-point relevance can be obtained by checking whether the `PMID` is in `PMIDs.json`.

### patient2patient_similarity.json

Full patient-to-patient similarity dataset. A dict where the keys are the `patient_uid` of queries and each entry is a list of `patient_uid`, representing the patients similar to the query. The 3-point similarity can be obtained by checking whether the similar patient shares its `PMID` (the string before '-' in `patient_uid`) with the query patient.

### PMID2Mesh.json

Dict mapping PMIDs to the MeSH terms of the article.

### MeSH_Humans_patient_uids.json

`patient_uid` of the patients in PMC-Patients-Humans (extracted from articles with the "Humans" MeSH term). List of string.

### PMC-Patients_citations.json

Citations for all articles we used to collect our dataset. A dict where the keys are `patient_uid` and each entry is the citation of the source article.

### human_PMIDs.json

PMIDs of the 500 randomly sampled articles for human evaluation. List of string.
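
The two membership checks described for `patient2article_relevance.json` and `patient2patient_similarity.json` can be sketched as below. This is a minimal illustration, not the authors' code: the helper names (`load_json`, `grade_articles`, `source_pmid`, `grade_patients`) are hypothetical, and the numeric grades (2 when the check succeeds, 1 otherwise, with unlisted items implicitly 0) are an assumption, since the description only states which check to perform.

```python
import json
from pathlib import Path


def load_json(path):
    """Load one of the metadata JSON files (e.g. PMIDs.json) from disk."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def grade_articles(relevant_pmids, patient_article_pmids,
                   in_set_grade=2, other_grade=1):
    """Grade each relevant PMID by whether it appears in PMIDs.json.

    The grade values are assumptions; pass a set for patient_article_pmids
    so that membership tests stay fast on the full 140,897-entry list.
    """
    return {
        pmid: in_set_grade if pmid in patient_article_pmids else other_grade
        for pmid in relevant_pmids
    }


def source_pmid(patient_uid):
    """Return the PMID part of a patient_uid (the string before '-')."""
    return patient_uid.split("-", 1)[0]


def grade_patients(query_uid, similar_uids,
                   same_article_grade=2, other_grade=1):
    """Grade each similar patient by whether it shares the query's PMID.

    As above, the grade values are assumptions made for illustration.
    """
    query_pmid = source_pmid(query_uid)
    return {
        uid: same_article_grade if source_pmid(uid) == query_pmid else other_grade
        for uid in similar_uids
    }
```

In practice one would call `set(load_json("PMIDs.json"))` once, then apply `grade_articles` to each entry of `patient2article_relevance.json` and `grade_patients` to each entry of `patient2patient_similarity.json`. The file paths depend on where the download was unpacked.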
### PMC-Patients_human_eval.json

Expert annotation results for the 500 articles in `human_PMIDs.json`, including the manually annotated patient notes, demographics, and relations of the top 5 retrieved articles / patients. List of dict; the keys are almost identical to those of `PMC-Patients.json`, with the exception of `human_patient_id` and `human_patient_uid`.

The relational annotations differ from the automatic ones. They are strings indicating on which dimension(s) a patient-article pair is relevant or a patient-patient pair is similar. "0", "1", "2", and "3" represent "Irrelevant", "Diagnosis", "Test", and "Treatment" in ReCDS-PAR, and "Dissimilar", "Features", "Outcomes", and "Exposure" in ReCDS-PPR. Note that a pair can be relevant / similar on multiple dimensions at the same time.

### PAR_PMIDs.json

PMIDs of the 11.7M articles used as the PAR corpus. List of string.
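
The relation strings in `PMC-Patients_human_eval.json` could be decoded along the following lines. The two lookup tables come straight from the description above. The rest is an assumption: the description does not say how multiple dimensions are written within one string, so this sketch treats every digit character as one code (for example `"13"`) and skips any other character, such as a separator. The name `decode_relation` is hypothetical.

```python
# Code tables taken from the dataset description.
PAR_DIMENSIONS = {"0": "Irrelevant", "1": "Diagnosis", "2": "Test", "3": "Treatment"}
PPR_DIMENSIONS = {"0": "Dissimilar", "1": "Features", "2": "Outcomes", "3": "Exposure"}


def decode_relation(code, task):
    """Translate an annotation string into a list of dimension names.

    Assumption: each character of `code` that is a known digit is one
    dimension; characters outside the table (e.g. separators) are ignored.
    `task` must be "PAR" (patient-article) or "PPR" (patient-patient).
    """
    if task == "PAR":
        table = PAR_DIMENSIONS
    elif task == "PPR":
        table = PPR_DIMENSIONS
    else:
        raise ValueError(f"unknown task: {task!r}")
    return [table[ch] for ch in code if ch in table]
```

It is worth inspecting a few real annotation strings before relying on this, since the actual encoding of multi-dimension pairs may differ from the assumed one.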
Created:
2023-11-06