Survival regression with accelerated failure time model in XGBoost
Source: tandf.figshare.com | Updated: 2023-05-31 | Indexed: 2025-03-22
Download link:
https://tandf.figshare.com/articles/dataset/Survival_regression_with_accelerated_failure_time_model_in_XGBoost/19651403/1
Description:
Survival regression is used to estimate the relation between time-to-event and feature variables, and is important in application domains such as medicine, marketing, risk management, and sales management. Nonlinear tree-based machine learning algorithms, as implemented in libraries such as XGBoost, scikit-learn, LightGBM, and CatBoost, are often more accurate in practice than linear models. However, existing state-of-the-art implementations of tree-based models have offered limited support for survival regression. In this work, we implement loss functions for learning accelerated failure time (AFT) models in XGBoost, to increase the support for survival modeling for different kinds of label censoring. We demonstrate with real and simulated experiments the effectiveness of AFT in XGBoost with respect to a number of baselines, in two respects: generalization performance and training speed. Furthermore, we take advantage of the support for NVIDIA GPUs in XGBoost to achieve substantial speedup over multi-core CPUs. To our knowledge, our work is the first implementation of AFT that utilizes the processing power of NVIDIA GPUs. Starting from the 1.2.0 release, the XGBoost package natively supports the AFT model. The addition of AFT in XGBoost has had a significant impact in the open-source community, and a few statistics packages now utilize the XGBoost AFT model.
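To make "loss functions for different kinds of label censoring" concrete, the following is a minimal sketch of an AFT negative log-likelihood for a single label. It assumes the model ln(Y) = pred + sigma * Z with Z following a standard normal distribution; the function names, the choice of distribution, and the small clipping constant are illustrative assumptions and are not taken from the XGBoost source code.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def aft_nloglik(pred, lower, upper, sigma=1.0):
    """Negative log-likelihood of one label under ln(Y) = pred + sigma * Z.

    pred is the model output on the log-time scale (hypothetical name).
    The event time is known to lie in [lower, upper]:
      lower == upper        -> uncensored
      upper == math.inf     -> right-censored
      lower == 0            -> left-censored
      otherwise             -> interval-censored
    """
    if lower == upper:
        # Uncensored: use the density of Y, which includes the 1 / (sigma * y) Jacobian.
        z = (math.log(lower) - pred) / sigma
        return -math.log(normal_pdf(z) / (sigma * lower))
    # Censored: use the probability mass that the model assigns to the interval.
    cdf_upper = 1.0 if math.isinf(upper) else normal_cdf((math.log(upper) - pred) / sigma)
    cdf_lower = 0.0 if lower <= 0.0 else normal_cdf((math.log(lower) - pred) / sigma)
    # Clipping constant is an assumption of this sketch, used to avoid log(0).
    return -math.log(max(cdf_upper - cdf_lower, 1e-12))


# One label of each censoring type, all scored against the same prediction.
labels = [(2.0, 2.0), (2.0, math.inf), (0.0, 2.0), (1.0, 3.0)]
losses = [aft_nloglik(0.5, lo, hi) for lo, hi in labels]
```

In XGBoost itself (1.2.0 and later), the AFT objective is selected with `objective="survival:aft"`, and the per-row bounds are supplied as the `label_lower_bound` and `label_upper_bound` fields of the `DMatrix`; the sketch above only illustrates the shape of the loss, not the library's implementation.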
Provider:
Taylor & Francis



