Table_2_Machine learning and feature extraction for rapid antimicrobial resistance prediction of Acinetobacter baumannii from whole-genome sequencing data.XLSX
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Table_2_Machine_learning_and_feature_extraction_for_rapid_antimicrobial_resistance_prediction_of_Acinetobacter_baumannii_from_whole-genome_sequencing_data_XLSX/24979347
Description:
Background: Whole-genome sequencing (WGS) has contributed significantly to advances in machine learning methods for predicting antimicrobial resistance (AMR). However, a comparison of AMR prediction methods that do not require prior knowledge of resistance has yet to be conducted.
Methods: We aimed to predict the minimum inhibitory concentrations (MICs) of 13 antimicrobial agents against Acinetobacter baumannii using three machine learning algorithms (random forest, support vector machine, and XGBoost) combined with k-mer features extracted from WGS data.
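As an illustration of what "k-mer features" means here, the following is a minimal sketch of counting overlapping k-mers in a single sequence and turning the counts into a fixed-length feature vector. It is not the authors' pipeline: the function names `count_kmers` and `to_feature_vector`, the choice to skip windows containing ambiguous bases, and the toy sequence are all assumptions made for this example.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_kmers(sequence: str, k: int = 11) -> Counter:
    """Count every overlapping k-mer that contains only A, C, G or T.

    Windows with ambiguous bases (for example N) are skipped; this is an
    assumption of the sketch, not something stated in the abstract.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


def to_feature_vector(counts: Counter, vocabulary: list) -> list:
    """Map k-mer counts onto a fixed, ordered vocabulary of k-mers."""
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Toy usage with k=4 so the result is easy to inspect by hand.
toy_counts = count_kmers("ACGTACGT", k=4)
toy_vector = to_feature_vector(toy_counts, ["ACGT", "CGTA", "TTTT"])
```

A vector built this way for each isolate could then be passed to any of the three classifiers named above; the vocabulary would be the set of selected k-mers rather than the hand-picked list used here.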
Results: A cohort of 339 isolates was used for model construction. The average essential agreement and category agreement of the best models exceeded 90.90% (95% CI, 89.03–92.77%) and 95.29% (95% CI, 94.91–95.67%), respectively, except for levofloxacin, minocycline and imipenem. The very major error rates ranged from 0.0% to 5.71%. We applied feature selection pipelines to extract the top-ranked 11-mers to optimise training time and computing resources. This approach slightly improved the prediction performance and enabled us to obtain prediction results within 10 min. Notably, when employing these top-ranked 11-mers in an independent test dataset (120 isolates), we achieved an average accuracy of 0.96.
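The abstract reports essential agreement, category agreement and very major error rates without defining them. The sketch below assumes the definitions commonly used in susceptibility-testing evaluation: essential agreement is a predicted MIC within one two-fold dilution of the reference MIC, category agreement is a matching S/I/R category, and a very major error is a resistant isolate predicted as susceptible, expressed as a fraction of resistant isolates. The breakpoints (`s_max=2`, `r_min=8`) and the MIC values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def essential_agreement(pred_mics, ref_mics):
    """Fraction of predictions within +/- 1 two-fold dilution of the reference."""
    hits = sum(
        abs(math.log2(p) - math.log2(r)) <= 1 + 1e-9
        for p, r in zip(pred_mics, ref_mics)
    )
    return hits / len(ref_mics)


def categorize(mic, s_max, r_min):
    """Assign S, I or R from hypothetical breakpoints (not from the source)."""
    if mic <= s_max:
        return "S"
    if mic >= r_min:
        return "R"
    return "I"


def category_agreement(pred_cats, ref_cats):
    """Fraction of isolates whose predicted category matches the reference."""
    return sum(p == r for p, r in zip(pred_cats, ref_cats)) / len(ref_cats)


def very_major_error_rate(pred_cats, ref_cats):
    """Resistant isolates predicted susceptible, divided by resistant isolates."""
    resistant = [p for p, r in zip(pred_cats, ref_cats) if r == "R"]
    if not resistant:
        return 0.0
    return sum(p == "S" for p in resistant) / len(resistant)


# Made-up MICs (mg/L) for four isolates.
ref = [4, 8, 2, 16]
pred = [2, 8, 16, 1]
ref_cats = [categorize(m, s_max=2, r_min=8) for m in ref]
pred_cats = [categorize(m, s_max=2, r_min=8) for m in pred]

ea = essential_agreement(pred, ref)
ca = category_agreement(pred_cats, ref_cats)
vme = very_major_error_rate(pred_cats, ref_cats)
```

In this toy case two of the four predictions fall within one dilution, one category matches, and one of the two resistant isolates is called susceptible.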
Conclusion: Our study is the first to demonstrate that AMR prediction for A. baumannii using machine learning methods based on k-mer features performs competitively with traditional workflows; hence, sequence-based AMR prediction and its application could be further promoted. The k-mer-based workflow developed in this study demonstrated high recall/sensitivity and specificity, making it a dependable tool for MIC prediction in clinical settings.
Created:
2024-01-11



