Data from: Building the graph of medicine from millions of clinical narratives
Source: DataCite Commons | Updated: 2025-05-01 | Indexed: 2025-04-09
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.jp917
Description:
Electronic health records (EHR) represent a rich and relatively untapped
resource for characterizing the true nature of clinical practice and for
quantifying the degree of inter-relatedness of medical entities such as
drugs, diseases, procedures and devices. We provide a unique set of
co-occurrence matrices, quantifying the pairwise mentions of 3 million
terms mapped onto 1 million clinical concepts, calculated from the raw
text of 20 million clinical notes spanning 19 years of data.
Co-frequencies were computed by means of a parallelized annotation,
hashing, and counting pipeline that was applied over clinical notes from
Stanford Hospitals and Clinics. The co-occurrence matrices quantify the
relatedness among medical concepts. They can serve as the basis for many
statistical tests and can be used to directly compute Bayesian
conditional probabilities, association rules, and test statistics such
as relative risks and odds ratios. This dataset can be
leveraged to quantitatively assess comorbidity, drug-drug, and
drug-disease patterns for a range of clinical, epidemiological, and
financial applications.
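
As a rough illustration of how the statistics named above can be derived from co-occurrence counts, the following Python sketch builds a 2x2 contingency table for one concept pair and computes conditional probabilities, association-rule measures, relative risk, and the odds ratio. It is a minimal sketch under stated assumptions, not the authors' pipeline: the names `PairCounts` and `association_stats` are hypothetical, the counts are invented for illustration, and the counting unit (notes versus patients) and the file layout of the published matrices are not assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairCounts:
    """Counts for one concept pair (hypothetical container, not a dataset schema)."""
    n_total: int  # total number of counting units (e.g. notes) in the corpus
    n_a: int      # units mentioning concept A
    n_b: int      # units mentioning concept B
    n_ab: int     # units mentioning both A and B


def association_stats(c: PairCounts) -> dict:
    """Derive common association measures from marginal and joint counts."""
    # Cells of the 2x2 contingency table.
    both = c.n_ab                                 # A and B
    a_only = c.n_a - c.n_ab                       # A without B
    b_only = c.n_b - c.n_ab                       # B without A
    neither = c.n_total - c.n_a - c.n_b + c.n_ab  # neither A nor B

    # Ratios below are undefined when a cell is empty; a real analysis
    # might apply a continuity correction instead of raising.
    if min(both, a_only, b_only, neither) <= 0:
        raise ValueError("all four contingency cells must be positive")

    p_b_given_a = both / c.n_a
    p_b_given_not_a = b_only / (c.n_total - c.n_a)

    return {
        "p_b_given_a": p_b_given_a,            # P(B | A)
        "p_a_given_b": both / c.n_b,           # P(A | B)
        "support": both / c.n_total,           # association rule A -> B
        "confidence": p_b_given_a,             # same as P(B | A)
        "lift": p_b_given_a / (c.n_b / c.n_total),
        "relative_risk": p_b_given_a / p_b_given_not_a,
        "odds_ratio": (both * neither) / (a_only * b_only),
    }


# Illustrative counts only; these numbers are not taken from the dataset.
example = PairCounts(n_total=1000, n_a=100, n_b=200, n_ab=50)
stats = association_stats(example)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The sketch needs only the joint count for a pair plus the marginal count of each concept and a corpus total, which is why a co-occurrence matrix is sufficient input for these measures.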
Provider:
Dryad
Created:
2014-08-28