Real and synthetic marker profiles.
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/Real_and_synthetic_marker_profiles_/22157061
Description:
Parkinson’s disease (PD) is characterized by a long prodromal phase in which a multitude of markers indicate an increased PD risk before clinical diagnosis based on motor symptoms. Current PD prediction models do not consider interdependencies between single predictors, do not differentiate subtypes of prodromal PD, and may be limited and potentially biased by confounding factors, nonspecific assessment methods and restricted access to comprehensive marker data from prospective cohorts. We used prospective data on 18 established risk and prodromal markers of PD in 1178 healthy, PD-free individuals and 24 incident PD cases, collected longitudinally in the Tübingen evaluation of Risk factors for Early detection of NeuroDegeneration (TREND) study at 4 visits over up to 10 years. We employed artificial intelligence (AI) to learn and quantify PD marker interdependencies via a Bayesian network (BN), with probabilistic confidence estimated by bootstrapping. The BN was then used to generate a synthetic cohort and individual marker profiles. Robust interdependencies were observed for BN edges from age to subthreshold parkinsonism and to urinary dysfunction; from sex to substantia nigra hyperechogenicity, depression, non-smoking and constipation; from depression to symptomatic hypotension and to excessive daytime somnolence; from solvent exposure to cognitive deficits and to physical inactivity; and from non-smoking to physical inactivity. Conversion to PD was interdependent with prior subthreshold parkinsonism, sex and substantia nigra hyperechogenicity. Several additional interdependencies with lower probabilistic confidence were identified. Synthetic subjects generated via the BN-based representation of the TREND study were realistic, as assessed through multiple approaches for comparing real and synthetic data.
Altogether, our work demonstrates the potential of modern AI approaches (specifically BNs) both for modelling and understanding interdependencies between PD risk and prodromal markers, which are so far not accounted for in PD prediction models, and for generating realistic synthetic data.
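As a rough illustration of how a Bayesian network can generate synthetic marker profiles, the sketch below draws binary profiles by ancestral (forward) sampling from a toy network. It is not the authors' implementation. The edge directions follow edges named in the description above, but the node names, the function names and every probability are hypothetical placeholders chosen for illustration; none of them are estimates from the TREND data.

```python
import random

# Toy network: each node maps to its parents, listed in topological order.
# Edge directions follow edges named in the description; everything else
# (node names, probabilities) is a hypothetical placeholder.
PARENTS = {
    "male": (),
    "sn_hyperechogenicity": ("male",),
    "depression": ("male",),
    "daytime_somnolence": ("depression",),
}

# P(node is True | parent values), keyed by the tuple of parent values.
# These numbers are made up and are NOT derived from the TREND study.
CPT = {
    "male": {(): 0.5},
    "sn_hyperechogenicity": {(True,): 0.25, (False,): 0.15},
    "depression": {(True,): 0.15, (False,): 0.30},
    "daytime_somnolence": {(True,): 0.20, (False,): 0.10},
}


def sample_subject(rng):
    """Draw one synthetic profile, sampling each node after its parents."""
    profile = {}
    for node, parents in PARENTS.items():  # dict order is topological here
        key = tuple(profile[p] for p in parents)
        profile[node] = rng.random() < CPT[node][key]
    return profile


def sample_cohort(n, seed=0):
    """Draw n synthetic profiles with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [sample_subject(rng) for _ in range(n)]


def marginal(cohort, node):
    """Fraction of subjects in the cohort for whom the marker is present."""
    return sum(subject[node] for subject in cohort) / len(cohort)


if __name__ == "__main__":
    cohort = sample_cohort(1000, seed=42)
    for name in PARENTS:
        print(name, round(marginal(cohort, name), 3))
```

Comparing marginals such as these between a real and a synthetic cohort is one simple check of realism; the description mentions multiple comparison approaches without detailing them here.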
Created:
2023-02-24



