
Hybrid Feature Selection for COVID-19 Severity Prediction Using Cuckoo Search with SVM Framework

Source: DataCite Commons | Updated: 2025-05-12 | Indexed: 2025-05-17
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/MHU8EC
Description:
Abstract

Objective: This study aims to identify the blood test markers that most strongly indicate the presence of COVID-19 in a patient. It uses the Cuckoo Search algorithm together with a Support Vector Machine (SVM) to explore the feature space efficiently and select the features that contribute most to model performance.

Methods: A novel hybrid feature selection method is proposed to improve the ability of SVMs to predict COVID-19 severity. The study uses blood test datasets, split 80% for training and 20% for testing. First, two statistical measures from the filter approach, chi-squared and mutual information, reduce the feature dimensionality. A modified Cuckoo Search algorithm then serves as a wrapper around the SVM. The approach is evaluated with accuracy, precision, recall, and F1 score.

Findings: The SVM classifier performed best with the features selected by the proposed hybrid method, reaching an accuracy of 92% on the blood test dataset. The results show that the hybrid approach selects a feature subset that simplifies the model while making it more accurate and faster to compute.

Novelty: This work proposes a new hybrid feature selection technique that combines filter and wrapper methods to find the best feature set. The authors state that this combination is introduced for the first time in COVID-19 prediction work: the modified Cuckoo Search algorithm uses the chi-squared and mutual information results to find the top features related to COVID-19 severity and to improve the performance of the SVM model.

Keywords: Feature selection, Cuckoo search, Machine learning, Support Vector Machine (SVM), Severity prediction, Healthcare
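The wrapper stage described under Methods can be illustrated with a generic binary Cuckoo Search over feature masks. This is only a sketch under stated assumptions: the abstract does not say how the authors modified the algorithm, so the version below uses textbook ingredients (Lévy-flight steps drawn with Mantegna's algorithm, a sigmoid transfer function to turn positions into 0/1 masks, and abandonment of a fraction `pa` of the worst nests). All names (`cuckoo_search`, `levy_step`, `binarize`, `toy_fitness`) and all parameter values are hypothetical. The `fitness` callable is a placeholder; in a setting like the study's it would return something like the validation accuracy of an SVM trained on the selected columns.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / max(abs(v), 1e-12) ** (1 / beta)


def binarize(position, rng):
    """Map continuous positions to a 0/1 feature mask via a sigmoid."""
    mask = [1 if rng.random() < 1 / (1 + math.exp(-x)) else 0 for x in position]
    if not any(mask):
        mask[rng.randrange(len(mask))] = 1  # always keep at least one feature
    return mask


def cuckoo_search(fitness, n_features, n_nests=15, n_iter=50,
                  pa=0.25, alpha=1.0, seed=0):
    """Maximise fitness(mask) over binary masks; returns (best_mask, best_score)."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(-1, 1) for _ in range(n_features)]

    nests = [random_nest() for _ in range(n_nests)]
    masks = [binarize(p, rng) for p in nests]
    scores = [fitness(m) for m in masks]
    top = max(range(n_nests), key=scores.__getitem__)
    best_mask, best_score = masks[top][:], scores[top]

    for _ in range(n_iter):
        for i in range(n_nests):
            # Levy flight from nest i; positions are clamped so exp() stays safe.
            cand = [max(-6.0, min(6.0, x + alpha * levy_step(rng))) for x in nests[i]]
            cand_mask = binarize(cand, rng)
            cand_score = fitness(cand_mask)
            j = rng.randrange(n_nests)  # compare against a randomly chosen nest
            if cand_score > scores[j]:
                nests[j], masks[j], scores[j] = cand, cand_mask, cand_score
        # Abandon the worst fraction pa of nests and rebuild them at random.
        order = sorted(range(n_nests), key=scores.__getitem__)
        for j in order[:int(pa * n_nests)]:
            nests[j] = random_nest()
            masks[j] = binarize(nests[j], rng)
            scores[j] = fitness(masks[j])
        top = max(range(n_nests), key=scores.__getitem__)
        if scores[top] > best_score:
            best_mask, best_score = masks[top][:], scores[top]
    return best_mask, best_score


# Toy stand-in for a classifier score (an assumption, not the study's fitness):
# reward three "informative" features and penalise every selected feature.
INFORMATIVE = (0, 3, 7)


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


best_mask, best_score = cuckoo_search(toy_fitness, n_features=10)
print(best_mask, best_score)
```

In a hybrid pipeline of the kind the abstract describes, `n_features` would be the number of columns that survive the chi-squared and mutual information filter, so the wrapper searches a much smaller space than the full set of blood test markers.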

Provider:
Harvard Dataverse
Created:
2025-04-29