Simulation of undiagnosed patients with novel genetic conditions
Source: DataONE | Updated: 2023-04-10 | Indexed: 2024-06-08
Download link:
https://search.dataone.org/view/sha256:5a6c8ca41b1112691c55c0a0eac206c58fffdb2187687f8f244333aad918f425
Description:
We present a computational pipeline to simulate realistic undiagnosed rare disease patients that can be used to evaluate gene prioritization tools. Each simulated patient is represented by a set of candidate disease-causing genes and a set of standardized phenotype terms.

Pipeline Features

Realistic simulation of patient phenotypes and candidate genes: We provide a taxonomy of categories of “distractor” genes that do not cause the patient’s presenting syndrome yet would be considered plausible candidates during the clinical process. We then introduce a simulation framework that jointly samples genes and phenotypes according to these categories to simulate nontrivial and realistic patients.

Modelling novel genetic conditions: To simulate patients with novel genetic conditions, we curate a knowledge graph (KG) of known gene–disease and gene–phenotype annotations that is time-stamped to 2015. This enables us to define post-2015 medical genetics discoveries as novel with respect to our KG. We manually time-stamp each disease and each disease–gene association according to the date of the PubMed article that reported the discovery, and we use these time-stamps to annotate each patient according to a set of disease–gene novelty categories.

This dataset houses (1) the simulated patients and (2) the data needed to run the computational pipeline, so that any user can generate their own simulated patient cohort. The accompanying GitHub repository can be found at: https://github.com/EmilyAlsentzer/rare-disease-simulation.
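The time-stamp-based novelty annotation described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `SimulatedPatient` record layout, its field names, the three category labels, and the choice to treat 2015 itself as "known" are hypothetical and are not taken from the dataset or the accompanying repository. Only the 2015 cutoff comes from the description.

```python
from dataclasses import dataclass

# From the description: the knowledge graph is time-stamped to 2015.
KG_CUTOFF_YEAR = 2015


@dataclass(frozen=True)
class SimulatedPatient:
    """Hypothetical record layout; field names are illustrative only."""
    candidate_genes: frozenset   # causal gene plus "distractor" genes
    phenotype_terms: frozenset   # standardized phenotype term identifiers
    causal_gene: str
    disease_year: int            # year the disease was first reported
    association_year: int        # year the gene-disease link was first reported


def novelty_label(patient: SimulatedPatient, cutoff: int = KG_CUTOFF_YEAR) -> str:
    """Return an illustrative novelty category relative to the KG cutoff.

    Assumption: a discovery counts as novel when it was reported strictly
    after the cutoff year. The category names are placeholders.
    """
    if patient.disease_year > cutoff:
        return "novel_disease"
    if patient.association_year > cutoff:
        return "novel_gene_disease_association"
    return "known"


# Placeholder identifiers, not real genes or phenotype terms.
example = SimulatedPatient(
    candidate_genes=frozenset({"GENE_A", "GENE_B", "GENE_C"}),
    phenotype_terms=frozenset({"PHENO_1", "PHENO_2"}),
    causal_gene="GENE_A",
    disease_year=2010,
    association_year=2018,
)
print(novelty_label(example))
```

In this sketch the disease was reported before the cutoff but its link to the causal gene was reported afterwards, so the patient is labeled as carrying a novel gene–disease association with respect to the 2015 KG. The actual categories and file formats used by the dataset should be taken from the repository linked above.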
Created:
2023-11-08



