
Key features of the XGBoost algorithm.

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Key_features_of_the_XGBoost_algorithm_/27163295
Description:
To improve the accuracy and efficiency of box office prediction, this study examines the application of an optimized eXtreme Gradient Boosting (XGBoost) model and its advantages over other commonly used machine learning models. The optimized model is compared with five models (Deep Neural Network, Light Gradient Boosting Machine, Random Forest, Gradient Boosting Decision Tree, and CatBoost) on several key performance indicators: accuracy, precision, recall, F1 score, generalization error, stability, robustness, and adaptability score. The results show that the proposed model outperforms the comparison models on most evaluation indicators, with the clearest advantage when the data volume reaches 2500. For example, accuracy rises to 0.9, the F1 score is 0.9, the generalization error falls to 0.09, the stability score reaches 0.98, and the robustness and adaptability scores are both 0.97, indicating strong predictive ability, stability, and robustness on large-scale datasets. The study therefore provides data support and a decision-making basis for the film industry when formulating marketing and distribution strategies. With accurate box office predictions, film producers and distributors can estimate market performance early in shooting, optimize investment decisions, and reduce economic risk.
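The description reports accuracy, precision, recall, and F1 score without defining them. As a minimal sketch, assuming a binary classification framing (the description does not say how box office outcomes are labeled), these four metrics can be computed from true and predicted labels as follows. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the dataset.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels.

    Illustrative helper only; not the study's evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, made up for illustration (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

for name, value in classification_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.2f}")
```

The remaining indicators named in the description (generalization error, stability, robustness, and adaptability score) are not defined there, so they are left out of this sketch.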

Created: 2024-10-03