Overview of the covariates.

Source: NIAID Data Ecosystem (indexed 2026-05-01)

Download link:

https://figshare.com/articles/dataset/Overview_of_the_covariates_/25384506

Description:

Slow patient enrollment, or failure to enroll the required number of patients, disrupts clinical trial timelines. To meet planned recruitment targets, site selection strategies are used during clinical trial planning to identify the research sites most likely to recruit a sufficiently high number of subjects within trial timelines. We developed a machine learning approach that outperforms baseline methods in ranking research sites by their expected recruitment in future studies. The approach uses indication-level historical recruitment data and real-world data to predict patient enrollment at the site level. We define covariates based on published recruitment hypotheses and examine the effect of these covariates in predicting patient enrollment. We compare the performance of a linear and a non-linear machine learning model with common industry baselines constructed from historical recruitment data. Performance of the methodology is evaluated and reported for two disease indications, inflammatory bowel disease and multiple myeloma, both of which are actively being pursued in clinical development. We validate recruitment hypotheses by reviewing the covariates' relationship with patient recruitment. For both indications, the non-linear model significantly outperforms the baselines and the linear model on the test set. In this paper, we present a machine learning approach to site selection that incorporates site-level recruitment and real-world patient data. The model ranks research sites by predicting the number of recruited patients, and our results suggest that the model can improve site ranking compared to common industry baselines.
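To make the idea of a historical-recruitment baseline concrete, the following is a minimal sketch, not the authors' method. It assumes a baseline that scores each site by its mean past enrollment, and it uses Spearman rank correlation as the ranking metric; the description above names neither the exact baseline nor the metric, so both are assumptions. The site names and counts are hypothetical toy values.

```python
from statistics import mean

# Hypothetical toy data (not from the dataset): past enrollment counts per
# site, and the enrollment later observed at each site in a new study.
history = {
    "site_a": [12, 9, 15],
    "site_b": [3, 4],
    "site_c": [8, 20, 2],
    "site_d": [6],
}
observed = {"site_a": 14, "site_b": 2, "site_c": 5, "site_d": 9}


def baseline_scores(history):
    """Assumed baseline: score each site by its mean historical enrollment."""
    return {site: mean(counts) for site, counts in history.items()}


def rank_sites(scores):
    """Return site names ordered from highest to lowest score."""
    return sorted(scores, key=lambda site: scores[site], reverse=True)


def spearman(scores, truth):
    """Spearman rank correlation; this simple formula assumes no tied values."""
    sites = list(scores)
    n = len(sites)

    def ranks(values):
        order = sorted(sites, key=lambda site: values[site], reverse=True)
        return {site: position for position, site in enumerate(order)}

    rank_pred, rank_true = ranks(scores), ranks(truth)
    d_squared = sum((rank_pred[s] - rank_true[s]) ** 2 for s in sites)
    return 1 - 6 * d_squared / (n * (n**2 - 1))


scores = baseline_scores(history)
ranking = rank_sites(scores)      # sites ordered by predicted enrollment
rho = spearman(scores, observed)  # agreement with the observed ordering
print(ranking, round(rho, 3))
```

A learned model would replace `baseline_scores` with predictions from site-level covariates; the ranking and evaluation steps would stay the same, which is what allows a model and a baseline to be compared on equal terms.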

Created:

2024-03-11
