A novel model for malaria prediction based on ensemble algorithms
Source: Figshare. Updated: 2019-12-26. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/A_novel_model_for_malaria_prediction_based_on_ensemble_algorithms/11463666
Description:
Background and objective: Most previous studies used a single traditional time series model to predict malaria incidence. A single model cannot effectively capture all the properties of the data structure, whereas a stacking architecture can address this limitation by combining distinct algorithms and models. This study compares the performance of traditional time series models and deep learning algorithms in malaria case prediction and explores the application value of stacking methods in infectious disease prediction.
Methods: The ARIMA, STL+ARIMA, BP-ANN and LSTM network models were applied separately in simulations using malaria data and meteorological data from Yunnan Province for 2011 to 2017. The predictive performance of each model was compared using three evaluation measures: RMSE, MASE and MAD. In addition, gradient-boosting regression trees (GBRTs) were used to combine the four models, and we assessed whether the stacking structure improved prediction performance.
Results: The root mean square errors (RMSEs) of the four sub-models were 13.176, 14.543, 9.571 and 7.208, respectively; the mean absolute scaled errors (MASEs) were 0.469, 0.472, 0.296 and 0.266; and the mean absolute deviations (MADs) were 6.403, 7.658, 5.871 and 5.691. After the four models were combined with the stacking architecture, the RMSE, MASE and MAD of the ensemble model decreased to 6.810, 0.224 and 4.625, respectively.
Conclusions: A novel ensemble model based on the robustness of structured prediction and model combination through stacking was developed. The findings suggest that the predictive performance of the final model is superior to that of the four sub-models, indicating that stacking architecture may have significant implications for infectious disease prediction.
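The three evaluation measures named in the description can be sketched in a few lines of Python. This is a minimal illustration only: the description does not give the exact formulas used in the study, so the definitions below are assumptions (MAD is taken as the mean absolute forecast error, and MASE scales it by the in-sample error of a one-step naive forecast). The function names and the sample numbers are made up for illustration and are not taken from the dataset.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error between observations and forecasts."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mad(actual, predicted):
    """Mean absolute deviation of forecasts from observations (assumed equal to MAE)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mase(actual, predicted, training):
    """MAD scaled by the in-sample MAE of a one-step naive forecast (assumed definition)."""
    naive_mae = sum(
        abs(training[i] - training[i - 1]) for i in range(1, len(training))
    ) / (len(training) - 1)
    return mad(actual, predicted) / naive_mae


# Hypothetical monthly case counts, for illustration only.
training = [10, 12, 9, 15, 11]
actual = [14, 10, 13]
predicted = [12, 11, 16]

print(rmse(actual, predicted), mase(actual, predicted, training), mad(actual, predicted))
```

Under these definitions, a MASE below 1 means the forecasts have a smaller average error than the naive forecast had on the training series, which is one way to read the sub-model and ensemble values reported above.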
Created:
2019-12-26



