Datasheet1_Feature-based clustering of the left ventricular strain curve for cardiovascular risk stratification in the general population.docx
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Datasheet1_Feature-based_clustering_of_the_left_ventricular_strain_curve_for_cardiovascular_risk_stratification_in_the_general_population_docx/24669687
Description:
Objective: Identifying individuals with subclinical cardiovascular (CV) disease could improve monitoring and risk stratification. While peak left ventricular (LV) systolic strain has emerged as a strong prognostic factor, few studies have analyzed the whole temporal profile of the deformation curves over the complete cardiac cycle. Therefore, in this longitudinal study, we applied an unsupervised machine learning approach based on time-series-derived features from the LV strain curve to identify distinct strain phenogroups that might be related to the risk of adverse cardiovascular events in the general population.
Methods: We prospectively studied 1,185 community-dwelling individuals (mean age, 53.2 years; 51.3% women), in whom we acquired clinical and echocardiographic data, including LV strain traces, at baseline and collected adverse events on average 9.1 years later. A Gaussian Mixture Model (GMM) was applied to features derived from LV strain curves, including the slopes during systole, early diastole, and late diastole, the peak strain, and the duration and height of diastasis. We evaluated the performance of the model using the clinical characteristics of the participants and the incidence of adverse events in the training dataset. To ascertain the validity of the trained model, we used an additional community-based cohort (n = 545) as an external validation cohort.
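As a rough illustration of the kind of curve-derived features named above, the sketch below computes peak strain, three phase slopes, and the duration and height of diastasis from a toy piecewise-linear strain curve. Everything in it is an assumption made for illustration: the synthetic curve, the function names `synthetic_strain` and `extract_features`, the flatness threshold `flat_tol`, and the feature definitions themselves. The abstract does not state the study's exact definitions.

```python
def synthetic_strain(n=101):
    """Toy LV strain curve (%) over one cycle, time normalized to [0, 1]."""
    # Hypothetical (time, strain) knots: onset, end-systole,
    # start of diastasis, end of diastasis, end of cycle.
    knots = [(0.0, 0.0), (0.35, -20.0), (0.55, -8.0), (0.80, -7.0), (1.0, 0.0)]
    t = [i / (n - 1) for i in range(n)]
    s = []
    for ti in t:
        for (t0, s0), (t1, s1) in zip(knots, knots[1:]):
            if t0 <= ti <= t1:
                s.append(s0 + (s1 - s0) * (ti - t0) / (t1 - t0))
                break
    return t, s


def extract_features(t, s, flat_tol=10.0):
    """Return illustrative strain-curve features (assumed definitions).

    flat_tol is an assumed threshold (% per normalized cycle) below which
    a segment counts as part of the diastasis plateau.
    """
    # Peak strain: the most negative value of the curve.
    i_peak = min(range(len(s)), key=s.__getitem__)
    slopes = [(s[i + 1] - s[i]) / (t[i + 1] - t[i]) for i in range(len(s) - 1)]

    # Diastasis: longest run of near-flat segments after the peak.
    best_start, best_len, run_start = None, 0, None
    for i in range(i_peak, len(slopes)):
        if abs(slopes[i]) < flat_tol:
            if run_start is None:
                run_start = i
            if i - run_start + 1 > best_len:
                best_start, best_len = run_start, i - run_start + 1
        else:
            run_start = None
    if best_start is None:
        raise ValueError("no diastasis plateau found")
    d0, d1 = best_start, best_start + best_len  # sample indices of the plateau
    if d0 == i_peak or d1 == len(s) - 1:
        raise ValueError("plateau must lie strictly inside diastole")

    plateau = s[d0:d1 + 1]
    return {
        "peak_strain": s[i_peak],
        "systolic_slope": (s[i_peak] - s[0]) / (t[i_peak] - t[0]),
        "early_diastolic_slope": (s[d0] - s[i_peak]) / (t[d0] - t[i_peak]),
        "late_diastolic_slope": (s[-1] - s[d1]) / (t[-1] - t[d1]),
        "diastasis_duration": t[d1] - t[d0],
        "diastasis_height": sum(plateau) / len(plateau),
    }


t, s = synthetic_strain()
features = extract_features(t, s)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Real strain traces are noisy and sampled at varying frame rates, so a practical version would likely need smoothing and cycle-length normalization before any slopes are estimated.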
Results: The most appropriate number of clusters to separate the LV strain curves was four. In clusters 1 and 2, we observed differences in age and heart rate distributions, but they had a similarly low prevalence of CV risk factors. Cluster 4 had the worst combination of CV risk factors, and a higher prevalence of LV hypertrophy and diastolic dysfunction than the other clusters. In cluster 3, the reported values were in between those of strain clusters 2 and 4. Adjusting for traditional covariables, we observed that clusters 3 and 4 had a significantly higher risk for CV (28% and 20%, P ≤ 0.038) and cardiac (57% and 43%, P ≤ 0.024) adverse events. Using SHAP values, we observed that the features that incorporate temporal information, such as the slope during systole and early diastole, had a higher impact on the model's decision than peak LV systolic strain.
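The abstract does not say how the number of clusters was selected. One common approach, shown here purely as an assumption, is to fit a GMM for a range of component counts and compare the Bayesian information criterion (BIC). The sketch uses scikit-learn's `GaussianMixture` on synthetic six-dimensional data standing in for the six strain-curve features; the data, the candidate range of 1 to 6 components, and the full covariance structure are all illustrative choices, not the study's settings.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a (participants x features) matrix:
# four well-separated groups in six dimensions (assumed, not study data).
rng = np.random.default_rng(0)
centers = np.zeros((4, 6))
centers[1, 0] = centers[2, 1] = centers[3, 2] = 10.0
X = np.vstack([c + rng.normal(scale=1.0, size=(200, 6)) for c in centers])

# Put all features on a comparable scale before clustering.
X_std = StandardScaler().fit_transform(X)

# Fit one GMM per candidate number of components and record its BIC.
bics = {}
for k in range(1, 7):
    gmm = GaussianMixture(
        n_components=k, covariance_type="full", n_init=5, random_state=0
    )
    gmm.fit(X_std)
    bics[k] = gmm.bic(X_std)

# Lower BIC is better; refit the selected model to get cluster assignments.
best_k = min(bics, key=bics.get)
model = GaussianMixture(
    n_components=best_k, covariance_type="full", n_init=5, random_state=0
).fit(X_std)
labels = model.predict(X_std)        # hard cluster assignment per row
proba = model.predict_proba(X_std)   # soft membership probabilities

print("BIC by k:", {k: round(v, 1) for k, v in bics.items()})
print("selected k:", best_k)
```

Because a GMM yields membership probabilities rather than only hard labels, `proba` can be used to flag individuals who sit between phenogroups instead of forcing them into a single cluster.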
Conclusion: Employing a GMM on features derived from the raw LV strain curves, we extracted clinically significant phenogroups which could provide additive prognostic information over the peak LV strain.
Created:
2023-11-30



